It's hard for me to remember the definitions and features of the two types of histograms, height-balanced (HBH) and frequency (aka value-based) histograms (FH or VBH). Here's my own summary.

        Criteria            |   Type   |           Meaning of
                            |          | xxx_tab_histograms.endpoint_number
----------------------------|----------|-----------------------------------
Distinct values >  #buckets | HBH      | ordinal number of bucket
Distinct values <= #buckets | FH (VBH) | cumulative count

Here's a ball-bucket analogy that helps me remember the definitions. We have some buckets and a lot of balls. Each ball has a number written on it. We sort all balls based on the numbers on them and put them in the buckets in the sorted order.

Although Oracle calls the second type frequency histogram in 10g documentation, the name value-based histogram as in 9i documentation better fits into our analogy; values or numbers on the balls in the same bucket span a same-width range for all buckets, not more, not less.

The name height-balanced histogram is a good one because you can imagine all buckets are filled to about the same depth (assuming the balls are of equal diameter!). The real technical name equi-depth histogram is even better. (Jonathan Lewis Cost Based Oracle p.157 says, according to Wolfgang Breitling, the technical term in computer science is not height balanced histogram, but equi-depth histogram, instead.)


10g xxx_tab_col_statistics has a histogram column that tells you what type of histogram you have. 9i doesn't have this. But the definition of 10g xxx_tab_col_statistics has the simple algorithm that determines the type and the algorithm works equally well in 9i.

select case when nvl(h.row_cnt,0) = 0 then 'NONE'
     when (h.bucket_cnt > 255
           or
           (h.bucket_cnt > h.distcnt and h.row_cnt = h.distcnt
            and h.density*h.bucket_cnt*2 <= 1))
         then 'FREQUENCY'
     else 'HEIGHT BALANCED'
end
from sys.hist_head$ h
where obj# = &object_id_for_the_table;

Even though documentation says "Frequency histograms are automatically created instead of height-balanced histograms when the number of distinct values is less than or equal to the number of histogram buckets specified", that's not a complete statement. The algorithm tells us this statement is not accurate; at least it should be appended with "or when the number of buckets is greater than 255".


Having said that, I don't think there's a way to create a histogram with more than or even equal to 255 buckets. The following is from 10gR2 (the same in 9i).

SQL> exec dbms_stats.gather_table_stats(user, 'testhist', method_opt=>'for columns mycolumn size 255')
BEGIN dbms_stats.gather_table_stats(user, 'testhist', method_opt=>'for columns mycolumn size 255'); END;

*
ERROR at line 1:
ORA-20000: Cannot parse for clause: FOR COLUMNS MYCOLUMN SIZE 255
ORA-06512: at "SYS.DBMS_STATS", line 13056
ORA-06512: at "SYS.DBMS_STATS", line 13076
ORA-06512: at line 1

SQL> analyze table testhist compute statistics for columns mycolumn size 255;
analyze table testhist compute statistics for columns mycolumn size 255
                                                                    *
ERROR at line 1:
ORA-01494: invalid SIZE specified

Indeed, documentation for both analyze and dbms_stats.gather_table_stats say the columns size should be <= 254.


To my OraNotes Page