Abstract
We study automata for capturing the transformations in practical natural language processing (NLP) systems, especially those that translate between human languages. For several variations of finite-state string and tree transducers, we survey answers to formal questions about their expressiveness, modularity, teachability, and generalization. We conclude that no formal device yet captures everything that is desirable, and we point to future research.
Knight, K. Capturing practical natural language transformations. Machine Translation 21, 121–133 (2007). https://doi.org/10.1007/s10590-008-9039-0
DOI: https://doi.org/10.1007/s10590-008-9039-0