Abstract
We study determinization of weighted finite-state automata (WFAs), which has important applications in automatic speech recognition (ASR). We provide the first polynomial-time algorithm to test for the twins property, which determines if a WFA admits a deterministic equivalent. We also provide a rigorous analysis of a determinization algorithm of Mohri, with tight bounds for acyclic WFAs. Given that WFAs can expand exponentially when determinized, we explore why those used in ASR tend to shrink. The folklore explanation is that ASR WFAs have an acyclic, multi-partite structure. We show, however, that there exist such WFAs that always incur exponential expansion when determinized. We then introduce a class of WFAs, also with this structure, whose expansion depends on the weights: some weightings cause them to shrink, while others, including random weightings, cause them to expand exponentially. We provide experimental evidence that ASR WFAs exhibit this weight dependence. That they shrink when determinized, therefore, is a result of favorable weightings in addition to special topology.
Work supported by AT&T Labs.
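The weight dependence described in the abstract can be illustrated on a toy example. What follows is a minimal sketch, not the paper's construction: it assumes the (min, +) semiring with integer weights, uses a standard subset construction with residual weights in the style of weighted determinization, and relies on a hypothetical two-track layered family (`two_track`) invented here for illustration. The topology is identical in both runs and only the weights change.

```python
from collections import defaultdict, deque


def determinize(start, arcs, finals):
    """Weighted subset construction over (min, +); terminates on acyclic input.

    arcs:   {state: [(label, weight, dest), ...]}
    finals: {state: final_weight}
    Each deterministic state is a frozenset of (state, residual_weight) pairs.
    """
    init = frozenset({(start, 0)})
    det_arcs, det_finals = {}, {}
    seen, queue = {init}, deque([init])
    while queue:
        subset = queue.popleft()
        final_costs = [r + finals[q] for q, r in subset if q in finals]
        if final_costs:
            det_finals[subset] = min(final_costs)
        # For each label, keep the cheapest cost of reaching each destination.
        by_label = defaultdict(dict)
        for q, r in subset:
            for label, w, dest in arcs.get(q, []):
                best = by_label[label]
                if dest not in best or r + w < best[dest]:
                    best[dest] = r + w
        det_arcs[subset] = {}
        for label, best in by_label.items():
            out_w = min(best.values())  # weight emitted on the new arc
            nxt = frozenset((d, c - out_w) for d, c in best.items())
            det_arcs[subset][label] = (out_w, nxt)
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return init, det_arcs, det_finals


def two_track(n, weight):
    """Hypothetical layered acyclic WFA: two parallel tracks A and B over {a, b}.

    weight(track, layer, label) returns the integer weight of that arc.
    """
    arcs = defaultdict(list)
    arcs["s"] = [("a", 0, ("A", 0)), ("a", 0, ("B", 0))]
    for i in range(n):
        for t in "AB":
            for label in "ab":
                arcs[(t, i)].append((label, weight(t, i, label), (t, i + 1)))
    finals = {("A", n): 0, ("B", n): 0}
    return "s", arcs, finals


def det_size(n, weight):
    """Number of states in the determinized machine."""
    _, det_arcs, _ = determinize(*two_track(n, weight))
    return len(det_arcs)


n = 4  # the input WFA has 2 * (n + 1) + 1 = 11 states
print(det_size(n, lambda t, i, l: 0))  # uniform weights
print(det_size(n, lambda t, i, l: 2 ** i if (t, l) == ("A", "a") else 0))
```

With uniform weights every input string leaves the same residuals, so each layer collapses to a single state and the result is smaller than the input. With the skewed weights, every distinct string leaves a distinct residual on track A, so the number of states doubles at each layer.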
References
R. K. Ahuja, T. L. Magnanti, and J. B. Orlin. Network Flows: Theory, Algorithms, and Applications. Prentice-Hall, 1993.
J. Berstel. Transduction and Context-Free Languages. Springer-Verlag, 1979.
J. Berstel and C. Reutenauer. Rational Series and Their Languages. Springer-Verlag, 1988.
E. Bocchieri, G. Riccardi, and J. Anantharaman. The 1994 AT&T ATIS CHRONUS recognizer. In Proc. ARPA SLT, pages 265–8, 1995.
A. L. Buchsbaum and R. Giancarlo. Algorithmic aspects in speech recognition: An introduction. ACM J. Exp. Algs., 2, 1997.
J. L. Carter and M. N. Wegman. Universal classes of hash functions. JCSS, 18:143–54, 1979.
C. Choffrut. Une caractérisation des fonctions séquentielles et des fonctions sous-séquentielles en tant que relations rationnelles. Theor. Comp. Sci., 5:325–37, 1977.
C. Choffrut. Contributions à l'étude de quelques familles remarquables de fonctions rationnelles. PhD thesis, LITP-Université Paris 7, 1978.
K. Culik II and J. Karhumäki. Finite automata computing real functions. SIAM J. Comp., 23(4):789–814, 1994.
K. Culik II and P. Rajcáni. Iterative weighted finite transductions. Acta Inf., 32:681–703, 1995.
D. Derencourt, J. Karhumäki, M. Latteux, and A. Terlutte. On computational power of weighted finite automata. In Proc. 17th MFCS, volume 629 of LNCS, pages 236–45. Springer-Verlag, 1992.
S. Eilenberg. Automata, Languages, and Machines, volume A. Academic Press, 1974.
J. Goldstine, C. M. R. Kintala, and D. Wotschke. On measuring nondeterminism in regular languages. Inf. and Comp., 86:179–94, 1990.
J. Kari and P. Fränti. Arithmetic coding of weighted finite automata. RAIRO Inform. Th. Appl., 28(3–4):343–60, 1994.
C. M. R. Kintala and D. Wotschke. Amounts of nondeterminism in finite automata. Acta Inf., 13:199–204, 1980.
W. Kuich and A. Salomaa. Semirings, Automata, Languages. Springer-Verlag, 1986.
M. Mohri. Finite-state transducers in language and speech processing. Comp. Ling., 23(2):269–311, 1997.
M. Mohri. On the use of sequential transducers in natural language processing. In Finite-State Language Processing. MIT Press, 1997.
F. Pereira and M. Riley. Speech recognition by composition of weighted finite automata. In Finite-State Language Processing. MIT Press, 1997.
F. Pereira, M. Riley, and R. Sproat. Weighted rational transductions and their application to human language processing. In Proc. ARPA HLT, pages 249–54, 1994.
F. P. Preparata and M. I. Shamos. Computational Geometry: An Introduction. Springer-Verlag, 1988.
M. O. Rabin. Probabilistic automata. Inf. and Control, 6:230–45, 1963.
M. D. Riley, A. Ljolje, D. Hindle, and F. C. N. Pereira. The AT&T 60,000 word speech-to-text system. In Proc. 4th EUROSPEECH, volume 1, pages 207–210, 1995.
E. Roche. Analyse Syntaxique Transformationnelle du Français par Transducteurs et Lexique-Grammaire. PhD thesis, LITP-Université Paris 7, 1993.
A. Salomaa and M. Soittola. Automata-Theoretic Aspects of Formal Power Series. Springer-Verlag, 1978.
M. Silberztein. Dictionnaires électroniques et analyse automatique de textes: le système INTEX. PhD thesis, Masson, Paris, France, 1993.
A. Weber and R. Klemm. Economy of description for single-valued transducers. Inf. and Comp., 118:327–40, 1995.
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
Cite this paper
Buchsbaum, A.L., Giancarlo, R., Westbrook, J.R. (1998). On the determinization of weighted finite automata. In: Larsen, K.G., Skyum, S., Winskel, G. (eds) Automata, Languages and Programming. ICALP 1998. Lecture Notes in Computer Science, vol 1443. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0055077
DOI: https://doi.org/10.1007/BFb0055077
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64781-2
Online ISBN: 978-3-540-68681-1
eBook Packages: Springer Book Archive