Universal Coding of Zipf Distributions

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2777)

Abstract

Background. One of the best-known results in information theory says that a data sequence $x_1, x_2, \ldots, x_n$ produced by independent random draws from a fixed distribution $P$ over a discrete domain can be compressed into a binary sequence, or code, whose expected length is at most $nH(P) + 1$ bits, where $H(P) = -\sum_i P_i \log P_i$ is the entropy of $P$. It is also known that this compression is near-optimal, as $nH(P)$ is the smallest achievable expected number of code bits.
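
The bound stated above can be checked numerically on a toy case. The following is a minimal sketch, not the paper's method: it assumes a finite Zipf law with $P_i \propto 1/i^s$ over $k$ symbols, takes the logarithm to base 2 so that entropy is measured in bits, and uses a Shannon code that assigns $\lceil -\log_2 P(x_1, \ldots, x_n) \rceil$ bits to each block. The function names and the parameters `k`, `s`, and `n` are illustrative choices, not taken from the source.

```python
import itertools
import math


def zipf_distribution(k, s=1.0):
    """Return P_i proportional to 1 / i**s for i = 1..k (a finite Zipf law)."""
    weights = [1.0 / (i ** s) for i in range(1, k + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def entropy_bits(p):
    """H(P) = -sum_i P_i log2 P_i, measured in bits."""
    return -sum(q * math.log2(q) for q in p if q > 0)


def expected_shannon_length(p, n):
    """Expected length of a Shannon code over blocks of n i.i.d. symbols.

    Each block gets a codeword of ceil(-log2 P(block)) bits.
    This enumerates all k**n blocks, so it only suits tiny k and n.
    """
    expected = 0.0
    for block in itertools.product(range(len(p)), repeat=n):
        prob = math.prod(p[i] for i in block)
        expected += prob * math.ceil(-math.log2(prob))
    return expected


# Assumed toy parameters: 4 symbols, exponent 1, blocks of 3 draws.
p = zipf_distribution(k=4)
n = 3
h = entropy_bits(p)
length = expected_shannon_length(p, n)
print(f"n*H(P) = {n * h:.4f} bits, expected code length = {length:.4f} bits")
```

Under these assumptions the printed expected length should fall between $nH(P)$ and $nH(P) + 1$. The code needs to know $P$ in advance, which is the assumption that universal coding removes.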

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Freund, Y., Orlitsky, A., Santhanam, P., Zhang, J. (2003). Universal Coding of Zipf Distributions. In: Schölkopf, B., Warmuth, M.K. (eds) Learning Theory and Kernel Machines. Lecture Notes in Computer Science, vol 2777. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45167-9_57

  • DOI: https://doi.org/10.1007/978-3-540-45167-9_57

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40720-1

  • Online ISBN: 978-3-540-45167-9

  • eBook Packages: Springer Book Archive
