Abstract
This paper presents an eclectic approach for compressing weighted finite-state automata and transducers, with minimal impact on performance. The approach is eclectic in the sense that various complementary methods have been employed: row-indexed storage of sparse matrices, dictionary compression, bit manipulation, and lossless omission of data. The compression rate is over 83% with respect to the current Bell Labs finite-state library.
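As an illustration of the first method named above, row-indexed storage of a sparse transition table can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's actual data layout: the names `build_row_indexed` and `lookup` are hypothetical, states and labels are assumed to be small non-negative integers, and the machine is assumed to have at most one arc per (state, label) pair.

```python
from bisect import bisect_left


def build_row_indexed(num_states, arcs):
    """Pack arcs into row-indexed (CSR-like) parallel arrays.

    arcs is an iterable of (src, label, dst, weight) tuples.
    Hypothetical helper for illustration only.
    """
    # Sort by source state, then by label, so each state's arcs are
    # contiguous and ordered by label.
    ordered = sorted(arcs, key=lambda a: (a[0], a[1]))

    # row_start[s] .. row_start[s + 1] delimits the arcs leaving state s.
    row_start = [0] * (num_states + 1)
    for src, _, _, _ in ordered:
        row_start[src + 1] += 1
    for s in range(num_states):
        row_start[s + 1] += row_start[s]

    labels = [a[1] for a in ordered]
    targets = [a[2] for a in ordered]
    weights = [a[3] for a in ordered]
    return row_start, labels, targets, weights


def lookup(table, state, label):
    """Return (target, weight) for the arc, or None if it is absent."""
    row_start, labels, targets, weights = table
    lo, hi = row_start[state], row_start[state + 1]
    # Binary search restricted to this state's slice of the label array.
    i = bisect_left(labels, label, lo, hi)
    if i < hi and labels[i] == label:
        return targets[i], weights[i]
    return None


arcs = [(0, 2, 1, 0.5), (0, 1, 2, 1.0), (2, 1, 0, 0.25)]
table = build_row_indexed(3, arcs)
print(lookup(table, 0, 2))
print(lookup(table, 1, 1))
```

Only the arcs that actually exist are stored, plus one offset per state, so memory grows with the number of arcs rather than with the number of states times the alphabet size. The other methods listed in the abstract (dictionary compression, bit manipulation, lossless omission of data) would be layered on top of such arrays and are not shown here.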
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kiraz, G.A. (2001). Compressed Storage of Sparse Finite-State Transducers. In: Boldt, O., Jürgensen, H. (eds) Automata Implementation. WIA 1999. Lecture Notes in Computer Science, vol 2214. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45526-4_11
DOI: https://doi.org/10.1007/3-540-45526-4_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42812-1
Online ISBN: 978-3-540-45526-4
eBook Packages: Springer Book Archive