Table Compression

  • Reference work entry
Encyclopedia of Algorithms

Years and Authors of Summarized Original Work

  • 2003; Buchsbaum, Fowler, Giancarlo

Problem Definition

Table compression was introduced by Buchsbaum et al. [3] as a unique application of compression, based on several distinguishing characteristics. Tables are collections of fixed-length records and can grow to terabytes in size. They are often generated by information systems and kept in data warehouses to facilitate ongoing operations. These warehouses typically manage many terabytes of data online, with significant capital and operational costs. In addition, the tables must be transmitted to different parts of an organization, which incurs further costs. Typical examples are tables of transaction activity, such as phone calls and credit card usage, which are stored once but then shipped repeatedly to different parts of an organization for fraud detection, billing, operations support, etc. The goals of table compression are to be fast, online, and effective:...
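To make the notion of a table as a collection of fixed-length records concrete, the following is a minimal sketch and not the method of [3]. The record layout (two 4-byte identifiers and a 2-byte duration), the synthetic call data, the helper names, and the use of zlib as a stand-in general-purpose compressor are all assumptions made for illustration. The sketch compresses the same table in row-major and in byte-column-major order and prints the resulting sizes so that they can be compared.

```python
import struct
import zlib

# Hypothetical record layout (an assumption, not taken from the entry):
# 4-byte caller id, 4-byte callee id, 2-byte duration in seconds.
RECORD = struct.Struct(">IIH")

def pack_table(rows):
    """Serialize rows into one byte string of fixed-length records."""
    return b"".join(RECORD.pack(*row) for row in rows)

def transpose(table, width):
    """Reorder a row-major table so that each byte column is contiguous."""
    return b"".join(table[i::width] for i in range(width))

def untranspose(data, width):
    """Inverse of transpose: restore the row-major order."""
    n = len(data) // width  # number of records
    return b"".join(data[r::n] for r in range(n))

# Synthetic transaction-like data, invented for this sketch.
rows = [(5550000 + i % 7, 5551000 + i % 11, 30 + i % 5) for i in range(1000)]
table = pack_table(rows)

# zlib stands in for any off-the-shelf compressor.
row_major = zlib.compress(table)
col_major = zlib.compress(transpose(table, RECORD.size))

print("raw bytes:          ", len(table))
print("row-major zlib:     ", len(row_major))
print("column-major zlib:  ", len(col_major))
```

Because every record has the same length, the table can be treated as a matrix of bytes, and the transposition is lossless: `untranspose` recovers the original byte string exactly. Which ordering compresses better depends on the data, so the sketch reports both sizes rather than assuming a winner.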

Recommended Reading

  1. Apostolico A, Cunial F, Kaul V (2008) Table compression by record intersection. In: Proceedings of the IEEE data compression conference (DCC), Snowbird, pp 13–22

  2. Blum A, Li M, Tromp J, Yannakakis M (1994) Linear approximation of shortest superstrings. J ACM 41:630–647

  3. Buchsbaum AL, Caldwell DF, Church KW, Fowler GS, Muthukrishnan S (2000) Engineering the compression of massive tables: an experimental approach. In: Proceedings of the 11th ACM-SIAM symposium on discrete algorithms, San Francisco, pp 175–184

  4. Buchsbaum AL, Fowler GS, Giancarlo R (2003) Improving table compression with combinatorial optimization. J ACM 50:825–851

  5. Burrows M, Wheeler D (1994) A block sorting lossless data compression algorithm. Technical report 124, Digital Equipment Corporation

  6. Cilibrasi R, Vitanyi PMB (2005) Clustering by compression. IEEE Trans Inf Theory 51:1523–1545

  7. Cormack G (1985) Data compression in a data base system. Commun ACM 28:1336–1350

  8. Cover TM, Thomas JA (1990) Elements of information theory. Wiley Interscience, New York

  9. Ferragina P, Giancarlo R, Manzini G, Sciortino M (2005) Boosting textual compression in optimal linear time. J ACM 52:688–713

  10. Ferragina P, Luccio F, Manzini G, Muthukrishnan S (2005) Structuring labeled trees for optimal succinctness, and beyond. In: Proceedings of the 45th annual IEEE symposium on foundations of computer science, Pittsburgh, pp 198–207

  11. Giancarlo R, Sciortino M, Restivo A (2007) From first principles to the Burrows and Wheeler transform and beyond, via combinatorial optimization. Theor Comput Sci 387:236–248

  12. Li M, Chen X, Li X, Ma B, Vitanyi PMB (2004) The similarity metric. IEEE Trans Inf Theory 50:3250–3264

  13. Liefke H, Suciu D (2000) XMILL: an efficient compressor for XML data. In: Proceedings of the 2000 ACM SIGMOD international conference on management of data, Dallas. ACM, New York, pp 153–164

  14. Lifshits Y, Mozes S, Weimann O, Ziv-Ukelson M (2009) Speeding up HMM decoding and training by exploiting sequence repetitions. Algorithmica 54:379–399

  15. Manzini G (2001) An analysis of the Burrows-Wheeler transform. J ACM 48:407–430

  16. Vo K-P (2006) Compression as data transformation. In: DCC: data compression conference, Snowbird. IEEE Computer Society TCC, Washington DC, p 403

  17. Vo BD, Vo K-P (2004) Using column dependency to compress tables. In: DCC: data compression conference, Snowbird. IEEE Computer Society TCC, Washington DC, pp 92–101

  18. Vo BD, Vo K-P (2007) Compressing table data with column dependency. Theor Comput Sci 387:273–283

  19. Ziv J, Lempel A (1977) A universal algorithm for sequential data compression. IEEE Trans Inf Theory 23:337–343

  20. Ziv J, Lempel A (1978) Compression of individual sequences via variable length coding. IEEE Trans Inf Theory 24:530–536

Corresponding author

Correspondence to Raffaele Giancarlo.

Copyright information

© 2016 Springer Science+Business Media New York

About this entry

Cite this entry

Giancarlo, R., Buchsbaum, A.L. (2016). Table Compression. In: Kao, MY. (eds) Encyclopedia of Algorithms. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-2864-4_418
