Histogram

Definition

Given a relation R and an attribute X of R, the domain D of X is the set of all possible values of X, and a finite set $V \subseteq D$ denotes the distinct values of X in an instance r of R. Let V be ordered, that is, $V = \{v_i : 1 \le i \le n\}$, where $v_i < v_j$ if $i < j$. The instance r of R restricted to X is denoted by T and can be represented as $T = \{(v_1, f_1), \ldots, (v_n, f_n)\}$. In T, each $v_i$ is distinct and is called a value of T; $f_i$ is the number of occurrences of $v_i$ in T and is called the frequency of $v_i$; T itself is called the data distribution. A histogram on data distribution T is constructed by the following two steps:
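As an illustration of the definition, the data distribution T can be derived from the raw values of attribute X by counting occurrences and sorting by value. The sketch below is only one possible way to do this; the function name `data_distribution` and the sample column `x_values` are hypothetical and do not come from the entry.

```python
from collections import Counter


def data_distribution(column):
    """Return T as a list of (value, frequency) pairs sorted by value.

    `column` stands for the values of attribute X in an instance r of R
    (a hypothetical input; any iterable of comparable values works).
    """
    counts = Counter(column)          # frequency f_i of each distinct value v_i
    return sorted(counts.items())     # order the pairs so that v_i < v_j for i < j


# Hypothetical sample column, for illustration only.
x_values = [30, 10, 20, 10, 30, 30, 50]
T = data_distribution(x_values)
print(T)  # [(10, 2), (20, 1), (30, 3), (50, 1)]
```

Here V = {10, 20, 30, 50}, n = 4, and the frequencies sum to the number of tuples in the sample column.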

  1. Partitioning the values of T into $\beta \ (\ge 1)$ disjoint intervals (called buckets), $\{B_i : 1 \le i \le \beta\}$, such that each value in $B_i$ is smaller than each value in $B_j$ if $i < j$

  2. Approximately representing the frequencies and values in each bucket
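The two construction steps can be sketched in code. The entry does not fix a partitioning rule or an approximation scheme, so the example below makes two assumptions of its own: buckets are contiguous runs holding roughly equal numbers of distinct values, and each bucket keeps only its value range, its number of distinct values, and its total frequency, with every value in the range assumed to share the average frequency. The names `Bucket`, `build_histogram`, and `estimate_frequency` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    low: float      # smallest value placed in the bucket
    high: float     # largest value placed in the bucket
    distinct: int   # number of distinct values in the bucket
    total: int      # sum of the frequencies of those values


def build_histogram(T, beta):
    """Step 1: split the sorted distribution T into at most beta buckets.
    Step 2: summarize each bucket by (low, high, distinct, total).

    Assumption: buckets hold roughly equal numbers of distinct values.
    """
    if beta < 1:
        raise ValueError("beta must be at least 1")
    n = len(T)
    k = min(beta, n)                  # never create an empty bucket
    buckets, start = [], 0
    for i in range(k):
        # Spread the remainder over the first n % k buckets.
        size = n // k + (1 if i < n % k else 0)
        chunk = T[start:start + size]
        buckets.append(Bucket(low=chunk[0][0], high=chunk[-1][0],
                              distinct=len(chunk),
                              total=sum(f for _, f in chunk)))
        start += size
    return buckets


def estimate_frequency(buckets, v):
    """Approximate the frequency of v by the bucket's average frequency."""
    for b in buckets:
        if b.low <= v <= b.high:
            return b.total / b.distinct
    return 0.0                        # v lies outside every bucket


# Hypothetical data distribution, for illustration only.
T = [(10, 2), (20, 1), (30, 3), (50, 1)]
hist = build_histogram(T, beta=2)
print(hist)
print(estimate_frequency(hist, 30))   # 2.0, while the true frequency is 3
```

Because contiguous runs of the sorted values are used, every value in an earlier bucket is smaller than every value in a later one, as step 1 requires. The estimate for 30 differs from its true frequency, which shows the approximation introduced in step 2.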

Key Points

Histogram, as a summarization of the data...

Author information

Correspondence to Qing Zhang.

Copyright information

© 2018 Springer Science+Business Media, LLC, part of Springer Nature

Cite this entry

Zhang, Q. (2018). Histogram. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_544