Skip to main content

The Myriad Virtues of Wavelet Trees

  • Conference paper
Automata, Languages and Programming (ICALP 2006)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 4051))

Included in the following conference series:

Abstract

Wavelet Trees have been introduced in [Grossi, Gupta and Vitter, SODA ’03] and have been rapidly recognized as a very flexible tool for the design of compressed full-text indexes and data compressors. Although several papers have investigated the beauty and usefulness of this data structure in the full-text indexing scenario, its impact on data compression has not been fully explored. In this paper we provide a complete theoretical analysis of a wide class of compression algorithms based on Wavelet Trees. We also show how to improve their asymptotic performance by introducing a novel framework, called Generalized Wavelet Trees, that aims for the best combination of binary compressors (like, Run-Length encoders) versus non-binary compressors (like, Huffman and Arithmetic encoders) and Wavelet Trees of properly-designed shapes. As a corollary, we prove high-order entropy bounds for the challenging combination of Burrows-Wheeler Transform and Wavelet Trees.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 84.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 109.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Abel, J.: Improvements to the Burrows-Wheeler compression algorithm: After BWT stages, http://citeseer.ist.psu.edu/abel03improvements.html

  2. Arnavut, Z., Magliveras, S.: Block sorting and compression. In: DCC: Data Compression Conference, pp. 181–190. IEEE Computer Society TCC, Los Alamitos (1997)

    Chapter  Google Scholar 

  3. Burrows, M., Wheeler, D.: A block sorting lossless data compression algorithm. Technical Report 124, Digital Equipment Corporation (1994)

    Google Scholar 

  4. Ferragina, P., Giancarlo, R., Manzini, G., Sciortino, M.: Boosting textual compression in optimal linear time. Journal of the ACM 52, 688–713 (2005)

    Article  MathSciNet  Google Scholar 

  5. Ferragina, P., Manzini, G.: Indexing compressed text. Journal of the ACM 52(4), 552–581 (2005)

    Article  MathSciNet  Google Scholar 

  6. Foschini, L., Grossi, R., Gupta, A., Vitter, J.: Fast compression with a static model in high order entropy. In: DCC: Data Compression Conference, pp. 62–71. IEEE Computer Society TCC, Los Alamitos (2004)

    Google Scholar 

  7. Grossi, R., Gupta, A., Vitter, J.: High-order entropy-compressed text indexes. In: Proc. 14th Annual ACM-SIAM Symp. on Discrete Algorithms (SODA 2003), pp. 841–850 (2003)

    Google Scholar 

  8. Grossi, R., Gupta, A., Vitter, J.: When indexing equals compression: Experiments on compressing suffix arrays and applications. In: Proc. 15th Annual ACM-SIAM Symp. on Discrete Algorithms (SODA 2004), pp. 636–645 (2004)

    Google Scholar 

  9. Grossi, R., Vitter, J.: Compressed suffix arrays and suffix trees with applications to text indexing and string matching. SIAM Journal on Computing 35, 378–407 (2005)

    Article  MATH  MathSciNet  Google Scholar 

  10. Mäkinen, V., Navarro, G.: Succinct suffix arrays based on rul-length encoding. Nordic Journal of Computing 12(1), 40–66 (2005)

    MathSciNet  Google Scholar 

  11. Manzini, G.: An analysis of the Burrows-Wheeler transform. Journal of the ACM 48(3), 407–430 (2001)

    Article  MathSciNet  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ferragina, P., Giancarlo, R., Manzini, G. (2006). The Myriad Virtues of Wavelet Trees. In: Bugliesi, M., Preneel, B., Sassone, V., Wegener, I. (eds) Automata, Languages and Programming. ICALP 2006. Lecture Notes in Computer Science, vol 4051. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11786986_49

Download citation

  • DOI: https://doi.org/10.1007/11786986_49

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-35904-3

  • Online ISBN: 978-3-540-35905-0

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics