Abstract
A new data structure is investigated that allows fast decoding of texts encoded by canonical Huffman codes. Its storage requirements are much lower than those of conventional Huffman trees, O(log² n) for trees of depth O(log n), and decoding is faster because some of the bit comparisons needed for decoding can be skipped. Empirical results on large real-life distributions show a reduction of up to 50%, and sometimes more, in the number of bit operations.
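For readers unfamiliar with canonical Huffman codes, the following is a minimal sketch of the standard table-driven way to decode them. It is not the data structure investigated in this paper. It assumes one common canonical convention (shorter codewords are numerically first, and ties are broken by symbol order); the function names `canonical_codes`, `build_tables` and `decode` and the example code lengths are hypothetical.

```python
from collections import Counter


def _by_length(lengths):
    # Canonical order: by codeword length, ties broken by symbol.
    return sorted(lengths.items(), key=lambda kv: (kv[1], kv[0]))


def canonical_codes(lengths):
    """Assign canonical codewords from a {symbol: codeword length} map."""
    codes, code, prev_len = {}, 0, 0
    for sym, length in _by_length(lengths):
        code <<= length - prev_len          # pad with zeros when the length grows
        codes[sym] = format(code, "0{}b".format(length))
        code += 1
        prev_len = length
    return codes


def build_tables(lengths):
    """Per-length tables: number of codewords, first codeword value, and
    index of the first symbol of that length in the ordered symbol list."""
    ordered = [sym for sym, _ in _by_length(lengths)]
    count = Counter(lengths.values())
    first, offset = {}, {}
    code = index = 0
    for length in range(1, max(count) + 1):
        first[length] = code
        offset[length] = index
        code = (code + count[length]) << 1
        index += count[length]
    return ordered, count, first, offset


def decode(bits, lengths):
    """Decode a string of '0'/'1' characters without building a tree."""
    ordered, count, first, offset = build_tables(lengths)
    out, code, length = [], 0, 0
    for bit in bits:
        code = (code << 1) | int(bit)
        length += 1
        rank = code - first[length]         # position among codewords of this length
        if 0 <= rank < count[length]:
            out.append(ordered[offset[length] + rank])
            code, length = 0, 0
    return out


# Hypothetical code lengths for a four-symbol alphabet.
example_lengths = {"a": 1, "b": 2, "c": 3, "d": 3}
example_codes = canonical_codes(example_lengths)
encoded = "".join(example_codes[s] for s in "abacd")
decoded = "".join(decode(encoded, example_lengths))
```

In this sketch the decoder keeps only the ordered symbol list and three small tables indexed by codeword length, and it still inspects every bit of the input.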
Partially supported by Grant 8560195 of the Israeli Ministry of Science and Arts
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Klein, S.T. (1997). Space- and time-efficient decoding with canonical Huffman trees. In: Apostolico, A., Hein, J. (eds) Combinatorial Pattern Matching. CPM 1997. Lecture Notes in Computer Science, vol 1264. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63220-4_50
DOI: https://doi.org/10.1007/3-540-63220-4_50
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63220-7
Online ISBN: 978-3-540-69214-0
eBook Packages: Springer Book Archive