Skip to main content

Balancing Histogram Optimality and Practicality for Query Result Size Estimation

Many current database systems use histograms to approximate the frequency distribution of values in the attributes of relations and based on them estimate query result sizes and access plan costs. In choosing among the various histograms, one has to balance between two conflicting goals: optimality, so that generated estimates have the least error, and practicality, so that histograms can be constructed and maintained efficiently. In this paper, we present both theoretical and experimental results on several issues related to this trade-off. Our overall conclusion is that the most effective approach is to focus on the class of histograms that accurately maintain the frequencies of a few attribute values and assume the uniform distribution for the rest, and choose for each relation the histogram in that class that is optimal for a self-join query.

Citation
Yannis Ioannidis, Viswanath Poosala, "Balancing Histogram Optimality and Practicality for Query Result Size Estimation ", ACM SIGMOD Conference, San Jose, CA, May 1995, pp. 233-244, 1995
TAGS
Access
Unknown
Published at
ACM SIGMOD Conference, San Jose, CA, May 1995, pp. 233-244
Related research area
No related research area
Related Organizations
No related organizations