Compositions of Tree-to-Tree Statistical Machine Translation Models

Maletti, Andreas

doi:10.1007/978-3-662-53132-7_24

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9840)

Included in the following conference series:

International Conference on Developments in Language Theory

Abstract

Compositions of well-known tree-to-tree translation models used in statistical machine translation are investigated. Synchronous context-free grammars are closed under composition in both the unweighted and the weighted case. In addition, it is demonstrated that there is a close connection between compositions of synchronous tree-substitution grammars and compositions of certain tree transducers because the intermediate trees can encode finite-state information. Utilizing these close ties, the composition closure of synchronous tree-substitution grammars is identified in the unweighted and the weighted case. In particular, in the weighted case, these results build on a novel lifting strategy that will also prove useful in other settings.
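
To make the notion of composition concrete, the following is a minimal sketch. It assumes finitely supported weighted translations stored as Python dictionaries and the usual sum-of-products definition of composition, \((\tau_1 ; \tau_2)(t, u) = \sum_s \tau_1(t, s) \cdot \tau_2(s, u)\). The dictionary representation, the function name `compose`, and the toy trees are illustrative assumptions and are not taken from the chapter. The grammars studied there generally define infinite translations, which this sketch does not handle.

```python
from collections import defaultdict


def compose(tau1, tau2):
    """Compose two finitely supported weighted translations.

    Each translation maps (input_tree, output_tree) pairs to weights;
    pairs that are absent have weight 0. Trees can be any hashable
    values (plain strings in this toy example).
    """
    # Index the second translation by its input tree for quick lookup.
    by_input = defaultdict(list)
    for (s, u), w in tau2.items():
        by_input[s].append((u, w))

    result = defaultdict(float)
    for (t, s), w1 in tau1.items():
        for u, w2 in by_input.get(s, []):
            # Multiply the weights along the intermediate tree s and
            # sum over all intermediate trees connecting t and u.
            result[(t, u)] += w1 * w2
    return dict(result)


# Hypothetical toy data: two intermediate trees lead to the same output.
tau1 = {("S(a)", "X(a)"): 0.5, ("S(a)", "Y(a)"): 0.5}
tau2 = {("X(a)", "T(b)"): 1.0, ("Y(a)", "T(b)"): 0.5}
composed = compose(tau1, tau2)
```

Here the pair `("S(a)", "T(b)")` collects the weight of both intermediate trees. Restricting all weights to 0 and 1 and replacing sum and product by disjunction and conjunction would give the unweighted relational composition.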

Supported by the German Research Foundation (DFG) grant MA/4959/1-1.

Notes

1. For technical reasons we disallow that \(\{t, t'\} \subseteq Q\).
2. Note that in an STSG the elements of \(\varSigma\) can label internal nodes and leaves.
3. For simplicity, we assume that nodes in different trees are disjoint.
4. Using only the weights 0 and 1.
5. A tree translation \(\tau : T_\varSigma(L) \times T_\varSigma(L) \rightarrow A\) is injective if for every output tree \(u \in T_\varSigma(L)\) there exists at most one input tree \(t \in T_\varSigma(L)\) such that \(\tau(t, u) \ne 0\).
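
The injectivity condition in note 5 can be checked directly when a translation has finite support. The sketch below assumes the translation is given as a dictionary from (input tree, output tree) pairs to weights, with absent pairs having weight 0. The function name `is_injective` and the example data are hypothetical and serve only to illustrate the definition.

```python
def is_injective(tau):
    """Return True if every output tree has at most one input tree
    with nonzero weight, following the definition in note 5.

    tau maps (input_tree, output_tree) pairs to weights.
    """
    input_for_output = {}
    for (t, u), weight in tau.items():
        if weight == 0:
            # Pairs with weight 0 do not count as translations.
            continue
        # Remember the first input seen for u; any different input
        # with nonzero weight violates injectivity.
        if input_for_output.setdefault(u, t) != t:
            return False
    return True


# Hypothetical examples.
injective_tau = {("A", "X"): 0.5, ("B", "Y"): 1.0}
non_injective_tau = {("A", "X"): 0.5, ("B", "X"): 0.2}
zero_weight_tau = {("A", "X"): 0.5, ("B", "X"): 0}
```

Note that the condition constrains inputs per output, so a single input tree may still have several output trees with nonzero weight.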

Author information

Authors and Affiliations

Institute for Natural Language Processing, Universität Stuttgart, Pfaffenwaldring 5b, 70569, Stuttgart, Germany
Andreas Maletti

Editor information

Editors and Affiliations

Université du Québec à Montréal, Montreal, Québec, Canada
Srečko Brlek
Dept Mathematiques, Univ du Quebec Montreal, Montreal, Québec, Canada
Christophe Reutenauer

About this paper

Cite this paper

Maletti, A. (2016). Compositions of Tree-to-Tree Statistical Machine Translation Models. In: Brlek, S., Reutenauer, C. (eds) Developments in Language Theory. DLT 2016. Lecture Notes in Computer Science, vol 9840. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-53132-7_24

DOI: https://doi.org/10.1007/978-3-662-53132-7_24
Published: 21 July 2016
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-53131-0
Online ISBN: 978-3-662-53132-7
eBook Packages: Computer Science, Computer Science (R0)
