Improved Variable-to-Fixed Length Codes

  • Conference paper
String Processing and Information Retrieval (SPIRE 2008)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5280)

Abstract

Although many compression methods are based on variable-length codes, there has recently been a trend toward alternatives in which the codeword lengths are more restricted, which can be useful for fast decoding and compressed searches. This paper explores the construction of variable-to-fixed length codes, which were suggested long ago by Tunstall. Using a new heuristic based on suffix trees, the performance of Tunstall codes could be improved by more than 30%.
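For readers unfamiliar with the baseline, the following is a minimal sketch of the classical Tunstall construction for a memoryless source. It is not the suffix-tree heuristic proposed in the paper, which the abstract does not describe in detail. The function name `tunstall_dictionary` and its parameters are hypothetical, chosen only for illustration.

```python
import heapq


def tunstall_dictionary(probs, codeword_bits):
    """Classical Tunstall dictionary (illustrative sketch, not the paper's method).

    probs: dict mapping each single-character symbol to its probability
           (assumed memoryless source).
    codeword_bits: fixed length, in bits, of every output codeword.
    Returns a dict mapping variable-length source strings to fixed-length codewords.
    """
    k = len(probs)
    max_words = 2 ** codeword_bits
    if k < 2:
        raise ValueError("need at least two symbols")
    if k > max_words:
        raise ValueError("codeword length too small for the alphabet")

    # The leaves of the parse tree, kept in a max-heap keyed on probability
    # (heapq is a min-heap, so probabilities are stored negated).
    heap = [(-p, s) for s, p in probs.items()]
    heapq.heapify(heap)

    # Expanding a leaf replaces it with k children, a net gain of k - 1 leaves.
    # Keep expanding the most probable leaf while the dictionary still fits.
    while len(heap) + k - 1 <= max_words:
        neg_p, word = heapq.heappop(heap)
        for s, p in probs.items():
            heapq.heappush(heap, (neg_p * p, word + s))

    # Assign fixed-length binary codewords to the leaves in sorted order.
    words = sorted(w for _, w in heap)
    return {w: format(i, f"0{codeword_bits}b") for i, w in enumerate(words)}


if __name__ == "__main__":
    for word, code in tunstall_dictionary({"a": 0.7, "b": 0.3}, 2).items():
        print(word, code)
```

With the two-symbol source above and 2-bit codewords, the most probable leaf is expanded twice, so frequent runs of `a` are captured by longer dictionary strings while every codeword keeps the same length.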



Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Klein, S.T., Shapira, D. (2008). Improved Variable-to-Fixed Length Codes. In: Amir, A., Turpin, A., Moffat, A. (eds) String Processing and Information Retrieval. SPIRE 2008. Lecture Notes in Computer Science, vol 5280. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89097-3_6

  • DOI: https://doi.org/10.1007/978-3-540-89097-3_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-89096-6

  • Online ISBN: 978-3-540-89097-3

  • eBook Packages: Computer Science, Computer Science (R0)
