Histograms on Streams

  • Reference work entry
Encyclopedia of Database Systems

Synonyms

Piecewise-constant approximations

Definition

A B-bucket histogram of length N is a partition of the set [0,N) of N integers into intervals \([b_0,b_1) \cup [b_1,b_2) \cup \cdots \cup [b_{B-1},b_B)\), where \(b_0 = 0\) and \(b_B = N\), together with a collection of B heights \(h_j\), for \(0 \le j < B\), one for each bucket. On point query i, the histogram answer is \(h_j\), where j is the index of the interval (or "bucket") containing i; that is, the unique j with \(b_j \le i < b_{j+1}\). In vector notation, \(\chi_S\) is the vector that is 1 on the set S and zero elsewhere, and the answer vector of a histogram is \(\vec{H} = \sum_{0 \le j < B} h_j \chi_{[b_j, b_{j+1})}\).
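The definition above can be sketched in a few lines of Python. This is only an illustration: the class name `Histogram` and its methods `point_query` and `answer_vector` are hypothetical names chosen here, and the sketch assumes non-empty buckets (strictly increasing boundaries), which the definition does not state explicitly.

```python
from bisect import bisect_right


class Histogram:
    """A B-bucket histogram on [0, N); hypothetical illustrative class."""

    def __init__(self, boundaries, heights):
        # boundaries = [b_0, ..., b_B] with b_0 = 0 and b_B = N
        # heights    = [h_0, ..., h_{B-1}], one height per bucket
        if len(boundaries) != len(heights) + 1:
            raise ValueError("need exactly one more boundary than heights")
        if boundaries[0] != 0:
            raise ValueError("b_0 must be 0")
        # Assumption of this sketch: every bucket is non-empty.
        if any(lo >= hi for lo, hi in zip(boundaries, boundaries[1:])):
            raise ValueError("boundaries must be strictly increasing")
        self.boundaries = list(boundaries)
        self.heights = list(heights)
        self.n = boundaries[-1]  # N = b_B

    def point_query(self, i):
        """Return h_j for the unique j with b_j <= i < b_{j+1}."""
        if not 0 <= i < self.n:
            raise IndexError("query point outside [0, N)")
        # bisect_right counts the boundaries that are <= i,
        # so subtracting 1 gives the bucket index j.
        j = bisect_right(self.boundaries, i) - 1
        return self.heights[j]

    def answer_vector(self):
        """Materialize the vector H = sum_j h_j * chi_[b_j, b_{j+1})."""
        return [self.point_query(i) for i in range(self.n)]


# A 3-bucket histogram of length 8: [0,3), [3,5), [5,8).
hist = Histogram([0, 3, 5, 8], [2.0, 7.0, 1.0])
print(hist.point_query(3))
print(hist.answer_vector())
```

Storing only the B + 1 boundaries and B heights, rather than the length-N answer vector, is what makes the histogram a compact summary; the binary search keeps a point query at O(log B) time.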

A histogram, \(\vec{H}\), is often used to approximate some other function, \(\vec{A}\), on [0,N). In building a B-bucket histogram, it is desirable to choose B − 1 boundaries \(b_j\) and B heights \(h_j\) that tend to minimize some distance, e.g., the sum square error \(\left\|\vec{A} - \vec{H}\right\|^2\) =...
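As a rough illustration of the error measure, the sketch below computes the sum square error, read here as the squared Euclidean distance between \(\vec{A}\) and \(\vec{H}\), for a histogram with given boundaries. It also uses the standard fact that, for fixed boundaries, the height minimizing the squared error in each bucket is the mean of the data in that bucket. The function names `sum_square_error` and `best_heights` are hypothetical, and this is an offline computation on a stored array, not a streaming algorithm.

```python
def sum_square_error(a, boundaries, heights):
    """Sum over i of (A[i] - H[i])^2, where H is the histogram's answer vector."""
    total = 0.0
    for j, h in enumerate(heights):
        # Bucket j covers the indices b_j <= i < b_{j+1}.
        for i in range(boundaries[j], boundaries[j + 1]):
            total += (a[i] - h) ** 2
    return total


def best_heights(a, boundaries):
    """For fixed boundaries, the bucket mean minimizes the squared error.

    Assumes every bucket is non-empty (strictly increasing boundaries).
    """
    return [sum(a[lo:hi]) / (hi - lo)
            for lo, hi in zip(boundaries, boundaries[1:])]


# Example data chosen for illustration only.
a = [1.0, 3.0, 2.0, 8.0, 6.0, 1.0, 1.0, 1.0]
boundaries = [0, 3, 5, 8]

heights = best_heights(a, boundaries)
print(heights)
print(sum_square_error(a, boundaries, heights))
```

Choosing the boundaries themselves is the harder part of the problem; this sketch only covers the heights once the boundaries are fixed.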

Copyright information

© 2009 Springer Science+Business Media, LLC

About this entry

Cite this entry

Strauss, M.J. (2009). Histograms on Streams. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-39940-9_191
