Compressing and Indexing Structured Text

  • Reference work entry
Encyclopedia of Algorithms

Years and Authors of Summarized Original Work

  • 2005; Ferragina, Luccio, Manzini, Muthukrishnan

Problem Definition

Trees are a fundamental structure in computing. They are used in almost every aspect of modeling and representation, for example to search for keys, maintain directories, and represent parsing or execution traces. One of the latest uses of trees is XML, the de facto format for data storage, integration, and exchange over the Internet (see http://www.w3.org/XML/). Explicit storage of trees, with one pointer per child as well as other auxiliary information (e.g., label), is often taken as given but can account for the dominant storage cost. To give an idea, a simple tree encoding needs at least 16 bytes per tree node, that is, four pointers of 4 bytes each: one pointer to the auxiliary information (e.g., node label) plus three node pointers to the parent, the first child, and the next sibling. This large space occupancy may even prevent the processing of medium-sized...
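The 16-bytes-per-node figure can be checked with a small sketch. The following Python fragment is only an illustration under stated assumptions: it takes each pointer to be a 32-bit unsigned index, and the names NODE_FORMAT, NIL, node_size, and encode_tree are hypothetical, not taken from the original work.

```python
import struct

# Assumed layout (not from the original work): four little-endian
# 32-bit fields per node, in this order:
#   label pointer, parent, first child, next sibling.
NODE_FORMAT = "<4I"
NIL = 0xFFFFFFFF  # hypothetical sentinel standing in for a null pointer


def node_size() -> int:
    """Bytes used by one explicitly stored node."""
    return struct.calcsize(NODE_FORMAT)


def encode_tree(nodes):
    """Pack a list of (label, parent, first_child, next_sibling) tuples."""
    return b"".join(struct.pack(NODE_FORMAT, *node) for node in nodes)


# A three-node tree: node 0 is the root, nodes 1 and 2 are its children.
tree = [
    (0, NIL, 1, NIL),   # root: no parent, first child is node 1
    (1, 0, NIL, 2),     # first child: next sibling is node 2
    (2, 0, NIL, NIL),   # second child: no further sibling
]
blob = encode_tree(tree)
print(node_size(), len(blob))  # prints "16 48"
```

With 64-bit pointers the same layout would double to 32 bytes per node, so the pointer-based cost grows with the number of nodes regardless of how compressible the labels are.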


Recommended Reading

  1. Barbay J, Golynski A, Munro JI, Rao SS (2007) Adaptive searching in succinctly encoded binary relations and tree-structured documents. Theor Comput Sci 387:284–297

  2. Barbay J, He M, Munro JI, Rao SS (2011) Succinct indexes for strings, binary relations and multi-labeled trees. ACM Trans Algorithms 7(4):article 52

  3. Benoit D, Demaine E, Munro JI, Raman R, Raman V, Rao SS (2005) Representing trees of higher degree. Algorithmica 43:275–292

  4. Burrows M, Wheeler D (1994) A block sorting lossless data compression algorithm. Technical report 124, Digital Equipment Corporation

  5. Farzan A, Munro JI (2014) A uniform paradigm to succinctly encode various families of trees. Algorithmica 68(1):16–40

  6. Farzan A, Raman R, Rao SS (2009) Universal succinct representations of trees? In: Proceedings of the 36th international colloquium on automata, languages and programming (ICALP, Part I), Rhodes. Lecture notes in computer science, vol 5555. Springer, pp 451–462

  7. Ferragina P, Venturini R (2007) A simple storage scheme for strings achieving entropy bounds. Theor Comput Sci 372(1):115–121

  8. Ferragina P, Luccio F, Manzini G, Muthukrishnan S (2005) Structuring labeled trees for optimal succinctness, and beyond. In: Proceedings of the 46th IEEE symposium on foundations of computer science (FOCS), Cambridge, pp 184–193. The journal version of this paper appears in J ACM 57(1) (2009)

  9. Ferragina P, Luccio F, Manzini G, Muthukrishnan S (2006) Compressing and searching XML data via two zips. In: Proceedings of the 15th World Wide Web conference (WWW), Edinburgh, pp 751–760

  10. Geary R, Raman R, Raman V (2006) Succinct ordinal trees with level-ancestor queries. ACM Trans Algorithms 2:510–534

  11. He M, Munro JI, Rao SS (2012) Succinct ordinal trees based on tree covering. ACM Trans Algorithms 8(4):article 42

  12. He M, Munro JI, Zhou G (2012) A framework for succinct labeled ordinal trees over large alphabets. In: Proceedings of the 23rd international symposium on algorithms and computation (ISAAC), Taipei. Lecture notes in computer science, vol 7676. Springer, pp 537–547

  13. Jacobson G (1989) Space-efficient static trees and graphs. In: Proceedings of the 30th IEEE symposium on foundations of computer science (FOCS), Triangle Park, pp 549–554

  14. Jansson J, Sadakane K, Sung W (2012) Ultra-succinct representation of ordered trees. J Comput Syst Sci 78:619–631

  15. Kosaraju SR (1989) Efficient tree pattern matching. In: Proceedings of the 30th IEEE symposium on foundations of computer science (FOCS), Triangle Park, pp 178–183

  16. Munro JI, Raman V (2001) Succinct representation of balanced parentheses and static trees. SIAM J Comput 31(3):762–776

  17. Navarro G, Mäkinen V (2007) Compressed full text indexes. ACM Comput Surv 39(1):article 2

  18. Navarro G, Sadakane K (2014) Fully functional static and dynamic succinct trees. ACM Trans Algorithms 10(3):article 16

  19. Raman R, Raman V, Rao SS (2007) Succinct indexable dictionaries with applications to encoding k-ary trees and multisets. ACM Trans Algorithms 3(4):article 43


Copyright information

© 2016 Springer Science+Business Media New York

About this entry

Cite this entry

Satti, P.F.R. (2016). Compressing and Indexing Structured Text. In: Kao, MY. (eds) Encyclopedia of Algorithms. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-2864-4_430
