
The Vcodex Platform for Data Compression

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 22)

Abstract

Vcodex is a software platform for constructing data compressors. It introduces the notion of data transforms: software components that encapsulate data transformation and compression techniques. The platform provides a variety of compression transforms, ranging from general-purpose compressors such as Huffman or Lempel-Ziv to structure-related ones such as reordering fields and columns in relational data tables. Transform composition enables the construction of compressors that are either general purpose or customized to data semantics. The software and data architecture of Vcodex will be presented. Examples and experimental results will be given showing how the approach helps to achieve compression performance far beyond traditional approaches.
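
The idea of composing transforms can be illustrated with a minimal Python sketch. It is not the Vcodex API: the names `Transform`, `Transpose`, `Deflate`, and `Compose` are hypothetical, the table is made-up sample data, and zlib's DEFLATE stands in for a general-purpose Lempel-Ziv plus Huffman stage. The sketch only assumes that each transform offers an `encode` and a matching `decode`, so that a chain of transforms is itself a transform.

```python
import zlib
from dataclasses import dataclass
from typing import Protocol, Sequence


class Transform(Protocol):
    """Hypothetical interface: any invertible bytes-to-bytes step."""

    def encode(self, data: bytes) -> bytes: ...
    def decode(self, data: bytes) -> bytes: ...


@dataclass(frozen=True)
class Transpose:
    """Structure-related step: store fixed-width records column by column."""

    width: int  # record length in bytes (assumed fixed)

    def encode(self, data: bytes) -> bytes:
        if len(data) % self.width:
            raise ValueError("data length must be a multiple of the record width")
        # Column c collects byte c of every record.
        return b"".join(data[c::self.width] for c in range(self.width))

    def decode(self, data: bytes) -> bytes:
        if len(data) % self.width:
            raise ValueError("data length must be a multiple of the record width")
        rows = len(data) // self.width
        # Record r collects byte r of every column.
        return b"".join(data[r::rows] for r in range(rows))


class Deflate:
    """General-purpose step: zlib's DEFLATE (Lempel-Ziv plus Huffman coding)."""

    def encode(self, data: bytes) -> bytes:
        return zlib.compress(data, 9)

    def decode(self, data: bytes) -> bytes:
        return zlib.decompress(data)


@dataclass(frozen=True)
class Compose:
    """A chain of transforms that is itself a transform."""

    stages: Sequence[Transform]

    def encode(self, data: bytes) -> bytes:
        for stage in self.stages:
            data = stage.encode(data)
        return data

    def decode(self, data: bytes) -> bytes:
        # Undo the stages in the opposite order.
        for stage in reversed(self.stages):
            data = stage.decode(data)
        return data


# Made-up table of 1000 records, each 11 bytes wide.
table = b"".join(b"%04d|NY|%03d" % (i, i % 7) for i in range(1000))

generic = Compose((Deflate(),))
tailored = Compose((Transpose(11), Deflate()))

assert tailored.decode(tailored.encode(table)) == table
print(len(table), len(generic.encode(table)), len(tailored.encode(table)))
```

Because `Compose` satisfies the same interface as its stages, a tailored compressor is built by listing transforms rather than by writing a new algorithm; the printed sizes let the two chains be compared on whatever data is supplied.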




© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Vo, KP. (2008). The Vcodex Platform for Data Compression. In: Filipe, J., Shishkov, B., Helfert, M., Maciaszek, L.A. (eds) Software and Data Technologies. ICSOFT ENASE 2007. Communications in Computer and Information Science, vol 22. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88655-6_15

  • DOI: https://doi.org/10.1007/978-3-540-88655-6_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-88654-9

  • Online ISBN: 978-3-540-88655-6

  • eBook Packages: Computer Science, Computer Science (R0)
