Concatenated index columns order
                                                 Last update (2008-04-09 11:08:41)
                                                                                                                   Date added (2008-04-08 13:56:40)

Summary
Is index columns order important? Use the following example. Setup 2 identical tables with 1.000.000 rows with 3 columns,
with the same index, but with different columns order.
CREATE TABLE HELPDESK.EXAMPLE 
(ID NUMBER(9), 
GROUP_NAME VARCHAR2(30), 
ACCOUNT_NAME VARCHAR2(30)) 
TABLESPACE HELPDESK;

CREATE SEQUENCE HELPDESK.EXAMPLE_SEQ
START WITH 0
INCREMENT BY 1
MINVALUE 0
NOCACHE 
NOCYCLE 
NOORDER;
Now run the pl/sql code to fill the table
BEGIN
    FOR i IN 1..10 LOOP
	  FOR k IN 1..100000 LOOP 
      INSERT INTO HELPDESK.EXAMPLE VALUES (helpdesk.example_seq.NEXTVAL, 'GROUP' || i, 'ACCOUNT' || k);
	  END LOOP;
	  COMMIT;
    END LOOP;
    END;
  /
It took 16min in a 16CPUs HP-UX PA-RISC Superdome. Now query to see the results
helpdesk@OTE> SELECT GROUP_NAME, COUNT(*) 
  2    FROM HELPDESK.EXAMPLE 
  3    GROUP BY GROUP_NAME;

GROUP_NAME                       COUNT(*)
------------------------------ ----------
GROUP1                             100000
GROUP10                            100000
GROUP2                             100000
GROUP3                             100000
GROUP4                             100000
GROUP5                             100000
GROUP6                             100000
GROUP7                             100000
GROUP8                             100000
GROUP9                             100000

10 rows selected.

Elapsed: 00:00:03.84
Create the second identical table HELPDESK.EXAMPLE1
CREATE TABLE HELPDESK.EXAMPLE1 TABLESPACE HELPDESK 
AS SELECT * FROM HELPDESK.EXAMPLE;
Create the two indexes with different column order and calculate the statistics.
  CREATE INDEX HELPDESK.EXAMPLE_N1 ON 
  HELPDESK.EXAMPLE (GROUP_NAME, ACCOUNT_NAME) 
  TABLESPACE HELPDESK;

  CREATE INDEX HELPDESK.EXAMPLE_N2 ON 
  HELPDESK.EXAMPLE1 (ACCOUNT_NAME, GROUP_NAME) 
  TABLESPACE HELPDESK;

BEGIN
 DBMS_STATS.GATHER_TABLE_STATS(ownname=>'HELPDESK', tabname=>'EXAMPLE', 
 estimate_percent=>100, CASCADE=>TRUE, DEGREE=>8);
END;
/

BEGIN
 DBMS_STATS.GATHER_TABLE_STATS(ownname=>'HELPDESK', tabname=>'EXAMPLE1', 
 estimate_percent=>100, CASCADE=>TRUE, DEGREE=>8);
END;
/
After the statistics you can see the CLUSTERING_FACTOR of the indexes!
SELECT INDEX_NAME, BLEVEL, DISTINCT_KEYS, CLUSTERING_FACTOR 
FROM DBA_INDEXES WHERE INDEX_NAME LIKE 'EXAMPLE_N%';

INDEX_NAME                         BLEVEL DISTINCT_KEYS CLUSTERING_FACTOR
------------------------------ ---------- ------------- -----------------
EXAMPLE_N1                              2       1000000            193240
EXAMPLE_N2                              2       1000000           1000000
Now lets examine the behavior of the two indexes with two sql queries.

1. First query search for GROUP_NAME and ACCOUNT_NAME.
SELECT * FROM HELPDESK.EXAMPLE 
  2  WHERE GROUP_NAME = 'GROUP5' 
  3  AND ACCOUNT_NAME = 'ACCOUNT4567';

        ID GROUP_NAME                     ACCOUNT_NAME
---------- ------------------------------ ------------------------------
    514566 GROUP5                         ACCOUNT4567

Elapsed: 00:00:23.65

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=5 Card=1 Bytes=25)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'EXAMPLE' (Cost=5 Card=1
          Bytes=25)

   2    1     INDEX (RANGE SCAN) OF 'EXAMPLE_N1' (NON-UNIQUE) (Cost=4
          Card=1)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          5  consistent gets
          0  physical reads
          0  redo size
        295  bytes sent via SQL*Net to client
        236  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed


> SELECT * FROM HELPDESK.EXAMPLE1 
  2  WHERE GROUP_NAME = 'GROUP5' 
  3  AND ACCOUNT_NAME = 'ACCOUNT4567';

        ID GROUP_NAME                     ACCOUNT_NAME
---------- ------------------------------ ------------------------------
    514566 GROUP5                         ACCOUNT4567

Elapsed: 00:00:23.71

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=6 Card=1 Bytes=25)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'EXAMPLE1' (Cost=6 Card=1
           Bytes=25)

   2    1     INDEX (RANGE SCAN) OF 'EXAMPLE_N2' (NON-UNIQUE) (Cost=4
          Card=1)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          5  consistent gets
          0  physical reads
          0  redo size
        295  bytes sent via SQL*Net to client
        236  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
As you can see when searching with all index columns the Cost is the same. Just 5 consistent gets.

2. Now query search for ACCOUNT_NAME only.
SELECT * FROM HELPDESK.EXAMPLE 
  2  WHERE ACCOUNT_NAME = 'ACCOUNT4567'; 

        ID GROUP_NAME                     ACCOUNT_NAME
---------- ------------------------------ ------------------------------
    114566 GROUP1                         ACCOUNT4567
   1014566 GROUP10                        ACCOUNT4567
    214566 GROUP2                         ACCOUNT4567
    314566 GROUP3                         ACCOUNT4567
    414566 GROUP4                         ACCOUNT4567
    514566 GROUP5                         ACCOUNT4567
    614566 GROUP6                         ACCOUNT4567
    714566 GROUP7                         ACCOUNT4567
    814566 GROUP8                         ACCOUNT4567
    914566 GROUP9                         ACCOUNT4567

10 rows selected.

Elapsed: 00:00:23.76

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=15 Card=10 Bytes=250
          )

   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'EXAMPLE' (Cost=15 Card=1
          0 Bytes=250)

   2    1     INDEX (SKIP SCAN) OF 'EXAMPLE_N1' (NON-UNIQUE) (Cost=13
          Card=10)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         50  consistent gets
          0  physical reads
          0  redo size
        459  bytes sent via SQL*Net to client
        236  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         10  rows processed

SELECT * FROM HELPDESK.EXAMPLE1 
  2  WHERE ACCOUNT_NAME = 'ACCOUNT4567';

        ID GROUP_NAME                     ACCOUNT_NAME
---------- ------------------------------ ----------------------------
    114566 GROUP1                         ACCOUNT4567
   1014566 GROUP10                        ACCOUNT4567
    214566 GROUP2                         ACCOUNT4567
    314566 GROUP3                         ACCOUNT4567
    414566 GROUP4                         ACCOUNT4567
    514566 GROUP5                         ACCOUNT4567
    614566 GROUP6                         ACCOUNT4567
    714566 GROUP7                         ACCOUNT4567
    814566 GROUP8                         ACCOUNT4567
    914566 GROUP9                         ACCOUNT4567

10 rows selected.

Elapsed: 00:00:25.76

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=14 Card=10 Bytes=250
          )

   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'EXAMPLE1' (Cost=14 Card=
          10 Bytes=250)

   2    1     INDEX (RANGE SCAN) OF 'EXAMPLE_N2' (NON-UNIQUE) (Cost=4
          Card=10)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         14  consistent gets
          0  physical reads
          0  redo size
        459  bytes sent via SQL*Net to client
        236  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         10  rows processed
The second index EXAMPLE_N2 has better cost than EXAMPLE_N1, only 14 consistent gets against 50.

So the conclusion is that when you create concatenated indexes it would be better if you put the columns with more distinct values first

Tip:For more information about Predicate Selectivity you can see Metalink Note: 68992.1

Other links
Change CBO Stats Selectivity for given Index and Column
Reviews
Categories
Filters
Search