[20190630]如何确定直方图类型.txt
http://dbaparadise.com/2018/06/which-histogram-will-oracle-pick/
let's explore how will Oracle decide which type of histograms to use.
Oracle is using 4 variables to determine the type of histogram to use. These variables are listed below:
1) number of distinct values in the column -> NDV. If the column in the table has the values red, blue, yellow, green,
then NDV=4
2) number of histogram buckets -> n. The default value for n is 254. If you don't explicitly specify the number of
buckets, then by default Oracle will set the number of buckets to 254.
3) internal variable, percentage threshold, p = (1–(1/n)) * 100; This is a formula set by Oracle.
4) the value of estimate_percent, when you gather statistics, whether or not it's set to default value AUTO_SAMPLE_SIZE,
when gathering statistics.
--//number of distinct values in the column -> NDV
--//n 指建立直方图桶的数量。缺省时254?
Here are the rules, set by Oracle when picking a specific type of histogram:
Oracle will pick a Frequency Histogram if:
NDV < n (n=254 by default)
if number of distinct values is less than the number of histogram buckets.
--//如果NDV<254 ,选择频率直方图.
Oracle will pick a Height Balanced Histogram if:
NDV > n (n=254 by default)
estimate_percent < > AUTO_SAMPLE_SIZE
if number of distinct values is greater than the number of histogram buckets
and
the estimate percent is not set to AUTO_SAMPLE_SIZE during stats gathering.
--//如果NDV>254,estimate_percent < > AUTO_SAMPLE_SIZE,选择Height Balanced Histogram.
Oracle will pick a Hybrid Histogram if:
NDV > n (n=254 by default)
estimate_percent = AUTO_SAMPLE_SIZE
percentage of rows for top n (n=254 by default) frequent values < p
If n=254, then p is 99.6
--// (1-1/254) *100 = 99.60629921259842519700
--//如果NDV > n,estimate_percent = AUTO_SAMPLE_SIZE.
--//流行值的比例 < p,选择Hybrid Histogram.
Oracle will pick a Top Frequency Histogram if
NDV > n (n=254 by default)
estimate_percent = AUTO_SAMPLE_SIZE
percentage of rows for top n (n=254 by default) frequent values >= p
--//如果NDV > n,estimate_percent = AUTO_SAMPLE_SIZE.
--//流行值的比例 < p,选择Top Frequency Histogram.
--//注:要生成Top Frequency Histogram 或者 Hybrid Histogram都要分析时estimate_percent = AUTO_SAMPLE_SIZE。
--//P的计算公式如下:
--//(1-1/n)*100= (1-1/254)*100 = 99.60629921259842519700