When the WHERE clause of a query specifies multiple
columns from a single table (multiple single column predicates), the
relationship between the columns can strongly affect the combined
selectivity for the column group.For example, consider the customers table in the SH schema. The columns cust_state_province and country_id are related, with cust_state_province determining the country_id for each customer. Suppose you query the customers table where the cust_state_province is California:
SQL> SELECT COUNT(*)
2 FROM sh.customers
3 WHERE cust_state_province = 'CA';
COUNT(*)
----------
3341
SQL> SELECT COUNT(*)
2 FROM sh.customers
3 WHERE cust_state_province = 'CA'
4 AND country_id=52790;
COUNT(*)
----------
3341
SQL> SELECT COUNT(*)
2 FROM sh.customers
3 WHERE cust_state_province = 'CA'
4 AND country_id=52775;
COUNT(*)
----------
0
With individual column statistics, the optimizer has no way of knowing that the cust_state_province and the country_id columns are related. By gathering statistics on these columns as a group (column group), the optimizer has a more accurate selectivity value for the group, instead of having to generate the value based on the individual column statistics.
You can create column groups manually by using the DBMS_STATS package. You can use this package to create a column group, get the
name of a column group, or delete a column group from a table.
1 Creating a Column Group
Use the create_extended_statistics function to create a column group. The create_extended_statistics function returns the system-generated name of the newly created column group. Table 1 lists the input parameters for this function.
Table 1 Parameters for the create_extended_statistics Function
Parameter | Description |
---|---|
owner |
Schema owner. NULL indicates current schema. |
tab_name |
Name of the table to which the column group is added. |
extension |
Columns in the column group. |
For example, to add a column group consisting of the cust_state_province and country_id columns to the customers table in SH schema, run the following PL/SQL block:
SQL> DECLARE2 cg_name varchar2(30);
3 BEGIN
4 cg_name := dbms_stats.create_extended_stats('sh','customers',
5 '(CUST_CITY,cust_state_province,country_id)');
6 END;
7 /
PL/SQL procedure successfully completed.
2 Getting a Column Group
Use the show_extended_stats_name function to obtain the name of the column group for a given set of columns. Table 2 lists the input parameters for this function.
Table 2 Parameters for the show_extended_stats_name Function
Parameter | Description |
---|---|
owner |
Schema owner. NULL indicates current schema. |
tab_name |
Name of the table to which the column group belongs. |
extension |
Name of the column group. |
For example, use the following query to obtain the column group name for a set of columns on the customers table:
SQL> select sys.dbms_stats.show_extended_stats_name('sh','customers','(CUST_CITY,CUST_STATE_PROVINCE,COUNTRY_ID)') cg_name from dual;CG_NAME
--------------------------------------------------------------------------------
SYS_STUMZ$C3AIHLPBROI#SKA58H_N
3 Monitoring Column Groups
Use the dictionary table user_stat_extensions to obtain information about MultiColumn statistics:
SQL> Select extension_name, extension from dba_stat_extensions where table_name='CUSTOMERS';EXTENSION_NAME EXTENSION
------------------------------ --------------------------------------------------------------------------------
SYS_NC00028$ (UPPER("CUST_LAST_NAME"))
SYS_NC00029$ (UPPER("CUST_FIRST_NAME"))
SYS_STUMZ$C3AIHLPBROI#SKA58H_N ("CUST_CITY","CUST_STATE_PROVINCE","COUNTRY_ID")
Use the following query to find the number of distinct values and find whether a histogram has been created for a column group:
SQL> select e.extension col_group, t.num_distinct, t.histogram
2 from dba_stat_extensions e, dba_tab_col_statistics t
3 where e.extension_name=t.column_name
4 and e.table_name=t.table_name
5 and t.table_name='CUSTOMERS';
COL_GROUP NUM_DISTINCT HISTOGRAM
-------------------------------------------------------------------------------- ------------ ---------------
("CUST_CITY","CUST_STATE_PROVINCE","COUNTRY_ID") 620 HEIGHT BALANCED
(UPPER("CUST_FIRST_NAME")) 170 NONE
(UPPER("CUST_LAST_NAME")) 176 NONE
4 Gathering Statistics on Column Groups
The METHOD_OPT argument of the DBMS_STATS package enables you to gather statistics on column groups. If you set the value of this argument to FOR ALL COLUMNS SIZE AUTO, then the optimizer gathers statistics on all existing column groups. To collect statistics on a new column group, specify the group using FOR COLUMNS. The column group is automatically created as part of statistic gathering.
For example, the following statement creates a new column group for the customers table on the columns cust_state_province, country_id and gathers statistics (including histograms) on the entire table and the new column group:
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS('SH','CUSTOMERS',METHOD_OPT =>'FOR ALL COLUMNS SIZE SKEWONLY FOR COLUMNS (CUST_CITY,CUST_STATE_PROVINCE,COUNTRY_ID) SIZE SKEWONLY');PL/SQL procedure successfully completed.