Abstract
Compositions of well-known tree-to-tree translation models used in statistical machine translation are investigated. Synchronous context-free grammars are closed under composition in both the unweighted and the weighted case. In addition, it is demonstrated that there is a close connection between compositions of synchronous tree-substitution grammars and compositions of certain tree transducers because the intermediate trees can encode finite-state information. Utilizing these close ties, the composition closure of synchronous tree-substitution grammars is identified in the unweighted and the weighted case. In particular, in the weighted case, these results build on a novel lifting strategy that will also prove useful in other settings.
Supported by the German Research Foundation (DFG) grant MA/4959/1-1.
Notes
1. For technical reasons we disallow that \(\{t, t'\} \subseteq Q\).
2. Note that in an STSG the elements of \(\varSigma\) can label internal nodes and leaves.
3. For simplicity, we assume that nodes in different trees are disjoint.
4. Using only the weights 0 and 1.
5. A tree translation \(\tau :T_\varSigma (L) \times T_\varSigma (L) \rightarrow A\) is injective if for every output tree \(u \in T_\varSigma (L)\) there exists at most one input tree \(t \in T_\varSigma (L)\) such that \(\tau (t, u) \ne 0\).
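The injectivity condition in the last note can be illustrated with a minimal sketch. It assumes a translation with finite support, stored as a dictionary from (input tree, output tree) pairs to weights, with trees encoded as nested tuples. The names `Translation` and `is_injective` and this encoding are hypothetical and do not come from the paper, whose translations are in general infinite.

```python
from typing import Dict, Hashable, Tuple

# Assumption: a tree is any hashable value, here a nested tuple (label, child, ...).
Tree = Hashable
# Assumption: a finitely supported weighted tree translation tau(t, u).
Translation = Dict[Tuple[Tree, Tree], float]


def is_injective(tau: Translation, zero: float = 0.0) -> bool:
    """Check that each output tree u has at most one input tree t with tau(t, u) != zero."""
    source_of: Dict[Tree, Tree] = {}
    for (t, u), weight in tau.items():
        if weight == zero:
            continue  # pairs with weight zero do not count
        if u in source_of and source_of[u] != t:
            return False  # two distinct inputs reach the same output
        source_of[u] = t
    return True


# Small hand-made trees (illustrative only).
a = ("a",)
b = ("b",)
fa = ("f", a)

tau_ok = {(a, fa): 0.5, (b, b): 1.0, (b, fa): 0.0}
tau_bad = {(a, fa): 0.5, (b, fa): 0.25}
```

Here `is_injective(tau_ok)` holds because the pair `(b, fa)` carries weight zero, whereas `tau_bad` maps both `a` and `b` to `fa` with nonzero weight and is therefore not injective.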
Copyright information
© 2016 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Maletti, A. (2016). Compositions of Tree-to-Tree Statistical Machine Translation Models. In: Brlek, S., Reutenauer, C. (eds) Developments in Language Theory. DLT 2016. Lecture Notes in Computer Science, vol 9840. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-53132-7_24
DOI: https://doi.org/10.1007/978-3-662-53132-7_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-53131-0
Online ISBN: 978-3-662-53132-7
eBook Packages: Computer Science (R0)