Space Efficient Algorithms for the Burrows-Wheeler Backtransformation

Algorithmica

Abstract

The Burrows-Wheeler transformation is used for effective data compression, e.g., in the well-known program bzip2. Compression and decompression are done in a block-wise fashion; larger blocks usually result in better compression rates. With the currently used algorithms for decompression, 4n bytes of auxiliary memory are needed to process a block of n bytes, 0 < n < 2^32. This may pose a problem in embedded systems (e.g., mobile phones), where RAM is a scarce resource.
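To illustrate where a figure like 4n bytes can come from, the following is a minimal sketch of the textbook backtransformation, which keeps one 32-bit link per input byte. It is not taken from the paper: the function names `bwt` and `inverse_bwt`, the `primary` row index, and the use of Python's `array("I")` (4 bytes per item on common platforms, which is an assumption) are all choices made for this illustration.

```python
from array import array


def bwt(text: bytes) -> tuple[bytes, int]:
    """Naive forward transform, for testing only: sort all rotations."""
    n = len(text)
    order = sorted(range(n), key=lambda i: text[i:] + text[:i])
    last = bytes(text[i - 1] for i in order)  # text[-1] wraps around for i == 0
    return last, order.index(0)  # index of the row holding the original text


def inverse_bwt(last: bytes, primary: int) -> bytes:
    """Textbook inverse that stores one link word per input byte."""
    n = len(last)
    # After the two loops below, start[c] is the number of bytes smaller than c,
    # i.e. the first sorted row that begins with byte value c.
    start = [0] * 256
    for c in last:
        start[c] += 1
    total = 0
    for c in range(256):
        start[c], total = total, total + start[c]
    # link[i] is the row whose rotation is row i's rotation shifted right by one.
    # This array is the auxiliary memory: n items, assumed to be 4 bytes each.
    link = array("I", [0]) * n
    for i, c in enumerate(last):
        link[i] = start[c]
        start[c] += 1
    # Walk the links from the primary row, filling the output back to front.
    out = bytearray(n)
    row = primary
    for k in range(n - 1, -1, -1):
        out[k] = last[row]
        row = link[row]
    return bytes(out)
```

For example, `inverse_bwt(*bwt(b"banana"))` returns `b"banana"`. The `link` array dominates the memory footprint, which is why shrinking or avoiding it is the natural target for a space-efficient variant.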

In this paper we present algorithms that reduce the memory requirement without sacrificing much speed.

The main results are as follows. Assuming an input string of n characters, 0 < n < 2^32, the reverse Burrows-Wheeler transformation can be done with 1.625 n bytes of auxiliary memory and O(n) runtime, using just a few operations per input character. Alternatively, we can use n/t bytes and 256 t n operations.
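The stated bounds can be compared with a little arithmetic. The sketch below is only an illustration: the helper names are hypothetical, the block size n = 900,000 and the parameter t = 4 are example values chosen here, and "256 t n" is read as the product 256 · t · n.

```python
def aux_bytes_standard(n: int) -> int:
    """Auxiliary memory of the commonly used decompression: 4n bytes."""
    return 4 * n


def aux_bytes_linear(n: int) -> float:
    """Auxiliary memory of the O(n) variant from the abstract: 1.625 n bytes."""
    return 13 * n / 8  # 1.625 == 13/8


def tradeoff(n: int, t: int) -> tuple[float, int]:
    """Memory (n/t bytes) and operation count (256 * t * n) for parameter t."""
    return n / t, 256 * t * n


n = 900_000  # illustrative block size in bytes (an assumption)
print(aux_bytes_standard(n))  # 3600000
print(aux_bytes_linear(n))    # 1462500.0
print(tradeoff(n, 4))         # (225000.0, 921600000)
```

Under these assumptions, raising t shrinks memory in proportion to 1/t while the operation count grows in proportion to t.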

The theoretical results are backed up by experimental data showing the space-time tradeoff.

Author information

Corresponding author

Correspondence to Ulrich Lauther.

Additional information

A preliminary version of this paper appeared in the Proceedings of the 13th Annual European Symposium on Algorithms (ESA), LNCS, vol. 3669, pp. 293–304, 2005. This research was done while Tamás Lukovszki was with Siemens AG, Corporate Technology.

About this article

Cite this article

Lauther, U., Lukovszki, T. Space Efficient Algorithms for the Burrows-Wheeler Backtransformation. Algorithmica 58, 339–351 (2010). https://doi.org/10.1007/s00453-008-9269-9

  • DOI: https://doi.org/10.1007/s00453-008-9269-9
