Abstract
Probabilistic finite-state string transducers (FSTs) are extremely popular in natural language processing, due to powerful generic methods for applying, composing, and learning them. Unfortunately, FSTs are not a good fit for much of the current work on probabilistic modeling for machine translation, summarization, paraphrasing, and language modeling. These methods operate directly on trees, rather than strings. We show that tree acceptors and tree transducers subsume most of this work, and we discuss algorithms for realizing the same benefits found in probabilistic string transduction.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Knight, K., Graehl, J. (2005). An Overview of Probabilistic Tree Transducers for Natural Language Processing. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2005. Lecture Notes in Computer Science, vol 3406. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30586-6_1
DOI: https://doi.org/10.1007/978-3-540-30586-6_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24523-0
Online ISBN: 978-3-540-30586-6
eBook Packages: Computer Science (R0)