# Column selectivityLast update (2008-04-09 14:30:59)                                                                                                                   Date added (2008-04-09 12:48:06)

Summary
The ratio of the Number of Distinct Values in the indexed column / number of rows in the table represents the Selectivity of a column.

The ideal selectivity is 1. Such a selectivity can be reached only by unique indexes on NOT NULL columns.

Follow the example to calculate the columns selectivity
```CREATE TABLE HELPDESK.EXAMPLE
(ID NUMBER(9),
GROUP_NAME VARCHAR2(30),
ACCOUNT_NAME VARCHAR2(30))
TABLESPACE HELPDESK;

CREATE SEQUENCE HELPDESK.EXAMPLE_SEQ
INCREMENT BY 1
MINVALUE 0
NOCACHE
NOCYCLE
NOORDER;
```
Now run the pl/sql code to fill the table
```BEGIN
FOR i IN 1..10 LOOP
FOR k IN 1..100000 LOOP
INSERT INTO HELPDESK.EXAMPLE VALUES (helpdesk.example_seq.NEXTVAL, 'GROUP' || i, 'ACCOUNT' || k);
END LOOP;
COMMIT;
END LOOP;
END;
/
```
You must calculate the statistics for the table
```BEGIN
DBMS_STATS.GATHER_TABLE_STATS(ownname=>'HELPDESK', tabname=>'EXAMPLE',
END;
/
```
First calculate the number of rows in the table
```SQL> SELECT TABLE_NAME, LAST_ANALYZED, NUM_ROWS
2  FROM DBA_TABLES
3  WHERE TABLE_NAME = 'EXAMPLE';

TABLE_NAME                     LAST_ANAL   NUM_ROWS
------------------------------ --------- ----------
EXAMPLE                        09-APR-08    1000000
```
The table has 1million rows so it is also possible to query DBA_TAB_COLUMNS to investigate the selectivity of each column individually.
```SQL> SELECT COLUMN_NAME, LAST_ANALYZED, NUM_DISTINCT, (NUM_DISTINCT/1000000)*100 || '%' SELECTIVITY
2  FROM DBA_TAB_COLUMNS
3  WHERE TABLE_NAME = 'EXAMPLE'
4  ORDER BY COLUMN_ID ASC;

COLUMN_NAME                    LAST_ANAL NUM_DISTINCT SELECTIVITY
------------------------------ --------- ------------ -----------------------------------------
ID                             09-APR-08      1000000 100%
GROUP_NAME                     09-APR-08           10 .001%
ACCOUNT_NAME                   09-APR-08       100000 10%
```
If you are investigating to create a composite index with columns GROUP_NAME and ACCOUNT_NAME then put as a leading column the column with the highest selectivity. In this case the ACCOUNT_NAME should be the leading column.