Definition
Given a relation $R$ and an attribute $X$ of $R$, the domain $D$ of $X$ is the set of all possible values of $X$, and a finite set $V \subseteq D$ denotes the distinct values of $X$ in an instance $r$ of $R$. Let $V$ be ordered, that is, $V = \{v_i : 1 \le i \le n\}$, where $v_i < v_j$ if $i < j$. The instance $r$ of $R$ restricted to $X$ is denoted by $T$ and can be represented as $T = \{(v_1, f_1), \ldots, (v_n, f_n)\}$. In $T$, each $v_i$ is distinct and is called a value of $T$; $f_i$ is the number of occurrences of $v_i$ in $T$ and is called the frequency of $v_i$; and $T$ is called the data distribution. A histogram on data distribution $T$ is constructed by the following two steps:
1. Partitioning the values of $T$ into $\beta$ ($\ge 1$) disjoint intervals (called buckets), $\{B_i : 1 \le i \le \beta\}$, such that each value in $B_i$ is smaller than every value in $B_j$ if $i < j$
2. Approximately representing the frequencies and values in each bucket
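The two steps can be illustrated with a minimal Python sketch. The definition above does not fix a partitioning rule or an approximation scheme, so the sketch makes two assumptions of its own: buckets hold a near-equal number of distinct values, and every frequency in a bucket is approximated by the bucket average. The names `Bucket`, `data_distribution`, `build_histogram`, and `estimate_frequency` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import NamedTuple


class Bucket(NamedTuple):
    lo: float        # smallest value in the bucket
    hi: float        # largest value in the bucket
    n_values: int    # number of distinct values in the bucket
    total_freq: int  # sum of the frequencies of those values


def data_distribution(column):
    """Return T = [(v_1, f_1), ..., (v_n, f_n)] with v_1 < ... < v_n."""
    return sorted(Counter(column).items())


def build_histogram(T, beta):
    """Step 1: split T into beta contiguous buckets; step 2: summarize each."""
    n = len(T)
    if not 1 <= beta <= n:
        raise ValueError("beta must satisfy 1 <= beta <= n")
    buckets = []
    for k in range(beta):
        # Assumed rule: bucket k covers T[start:end]; sizes differ by at most one.
        start, end = k * n // beta, (k + 1) * n // beta
        part = T[start:end]
        buckets.append(
            Bucket(part[0][0], part[-1][0], len(part), sum(f for _, f in part))
        )
    return buckets


def estimate_frequency(buckets, v):
    """Approximate f(v) by the average frequency of the bucket containing v."""
    for b in buckets:
        if b.lo <= v <= b.hi:
            return b.total_freq / b.n_values
    return 0.0  # v lies outside every bucket


column = [1, 1, 2, 3, 3, 3, 5, 8, 8, 9]
T = data_distribution(column)      # [(1, 2), (2, 1), (3, 3), (5, 1), (8, 2), (9, 1)]
hist = build_histogram(T, beta=2)  # buckets covering [1, 3] and [5, 9]
```

In this example the true frequency of the value 3 is 3, while the histogram reports the bucket average 6 / 3 = 2.0. The gap shows the approximation error that a bucketing scheme trades for a smaller summary.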
Key Points
Histogram, as a summarization of the data...
Recommended Reading
Buccafurri F, Rosaci D, Pontieri L, Sacca D. Improving range query estimation on histograms. In: Proceedings of the 18th International Conference on Data Engineering; 2002. p. 628–38.
Konig AC, Weikum G. Combining histograms and parametric curve fitting for feedback-driven query result-size estimation. In: Proceedings of the 25th International Conference on Very Large Data Bases; 1999.
Poosala V, Ioannidis YE, Haas PJ, Shekita EJ. Improved histograms for selectivity estimation of range predicates. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 1996. p. 294–305.
Copyright information
© 2018 Springer Science+Business Media, LLC, part of Springer Nature
About this entry
Cite this entry
Zhang, Q. (2018). Histogram. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_544
DOI: https://doi.org/10.1007/978-1-4614-8265-9_544
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8266-6
Online ISBN: 978-1-4614-8265-9
eBook Packages: Computer Science, Reference Module Computer Science and Engineering