1.Clustering Factor is a measure of the orderedness of an index in comparison to the table that is based upon
It is used as an indicator for computing the estimated cost of the table lookup following an index access.
2.The Clustering Factor records the number of data blocks that will be accessed when scanning an index.
3.Clustering Factor only happen on Physical reads.
SQL> create table tt5(id1 int,id2 int);
Table created.
SQL> create index tt5_i1 on tt5(id1);
Index created.
SQL> create index tt5_i2 on tt5(id2);
Index created.
SQL> insert into tt5 select level,trunc(dbms_random.value(0,10000)) from dual connect by level<=10000;
10000 rows created.
SQL> commit;
Commit complete.
SQL> exec dbms_stats.gather_table_stats('scott','tt5');
PL/SQL procedure successfully completed.
SQL> select table_name,num_rows,blocks,sample_size from user_tables where table_name='TT5';
TABLE_NAME NUM_ROWS BLOCKS SAMPLE_SIZE
-------------------- ---------- ---------- -----------
TT5 10000 20 10000
SQL> select column_name,num_distinct,num_nulls,density from user_tab_columns where table_name='TT5';
COLUMN_NAME NUM_DISTINCT NUM_NULLS DENSITY
-------------------- ------------ ---------- ----------
ID1 10000 0 .0001
ID2 6262 0 .000159693
SQL> select index_name,blevel,leaf_blocks,clustering_factor from user_indexes where table_name='TT5';
INDEX_NAME BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
-------------------- ---------- ----------- -----------------
TT5_I1 1 19 18
TT5_I2 1 30 9419
index tt5_i1's clustering factor is close tt5 table's blocks number.This is good.
index tt5_i2's clustering factor is far than tt5 table's blocks number.This is not good.
test:
sql1
SQL> explain plan for select * from tt5 where id1 between 1 and 100;
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------
Plan hash value: 3764321786
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100 | 700 | 3 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TT5 | 100 | 700 | 3 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | TT5_I1 | 100 | | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID1">=1 AND "ID1"<=100)
14 rows selected.
sql2
SQL> explain plan for select * from tt5 where id2 between 1 and 100;
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------
Plan hash value: 3997547206
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 101 | 707 | 6 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| TT5 | 101 | 707 | 6 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("ID2"<=100 AND "ID2">=1)
13 rows selected.
可以发现第两条sql的cost比第一条的.而且sql2的选择的行数比例为101/10000 约等于 1%,确没有走index,比较奇怪。
看一下强制走index,cost是多少.
SQL> explain plan for select /*+index(tt5)*/ * from tt5 where id2 between 1 and 100;
Explained.
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------
Plan hash value: 2796517782
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 101 | 707 | 97 (0)| 00:00:02 |
| 1 | TABLE ACCESS BY INDEX ROWID| TT5 | 101 | 707 | 97 (0)| 00:00:02 |
|* 2 | INDEX RANGE SCAN | TT5_I2 | 101 | | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID2">=1 AND "ID2"<=100)
14 rows selected.
可以发现强制走index的时候cost为97,cost更高。
clustering factor的解决办法只有一个就是rebuild table