Column selectivity
                                                 Last update (2008-04-09 14:30:59)
                                                                                                                   Date added (2008-04-09 12:48:06)

Summary
The ratio of the Number of Distinct Values in the indexed column / number of rows in the table represents the Selectivity of a column.

The ideal selectivity is 1. Such a selectivity can be reached only by unique indexes on NOT NULL columns.

Follow the example to calculate the columns selectivity
CREATE TABLE HELPDESK.EXAMPLE 
(ID NUMBER(9), 
GROUP_NAME VARCHAR2(30), 
ACCOUNT_NAME VARCHAR2(30)) 
TABLESPACE HELPDESK;

CREATE SEQUENCE HELPDESK.EXAMPLE_SEQ
START WITH 0
INCREMENT BY 1
MINVALUE 0
NOCACHE 
NOCYCLE 
NOORDER;
Now run the pl/sql code to fill the table
BEGIN
    FOR i IN 1..10 LOOP
	  FOR k IN 1..100000 LOOP 
      INSERT INTO HELPDESK.EXAMPLE VALUES (helpdesk.example_seq.NEXTVAL, 'GROUP' || i, 'ACCOUNT' || k);
	  END LOOP;
	  COMMIT;
    END LOOP;
    END;
  /
You must calculate the statistics for the table
BEGIN
 DBMS_STATS.GATHER_TABLE_STATS(ownname=>'HELPDESK', tabname=>'EXAMPLE', 
 estimate_percent=>100, CASCADE=>TRUE, DEGREE=>8);
END;
/
First calculate the number of rows in the table
SQL> SELECT TABLE_NAME, LAST_ANALYZED, NUM_ROWS 
  2  FROM DBA_TABLES 
  3  WHERE TABLE_NAME = 'EXAMPLE';

TABLE_NAME                     LAST_ANAL   NUM_ROWS
------------------------------ --------- ----------
EXAMPLE                        09-APR-08    1000000
The table has 1million rows so it is also possible to query DBA_TAB_COLUMNS to investigate the selectivity of each column individually.
SQL> SELECT COLUMN_NAME, LAST_ANALYZED, NUM_DISTINCT, (NUM_DISTINCT/1000000)*100 || '%' SELECTIVITY
  2  FROM DBA_TAB_COLUMNS 
  3  WHERE TABLE_NAME = 'EXAMPLE' 
  4  ORDER BY COLUMN_ID ASC;

COLUMN_NAME                    LAST_ANAL NUM_DISTINCT SELECTIVITY
------------------------------ --------- ------------ -----------------------------------------
ID                             09-APR-08      1000000 100%
GROUP_NAME                     09-APR-08           10 .001%
ACCOUNT_NAME                   09-APR-08       100000 10%
If you are investigating to create a composite index with columns GROUP_NAME and ACCOUNT_NAME then put as a leading column the column with the highest selectivity. In this case the ACCOUNT_NAME should be the leading column.

Other links
Concatenated index columns order
Change CBO Stats Selectivity for given Index and Column
Reviews
Categories
Filters
Search