Abstract
Word alignment plays a critical role in statistical machine translation systems. The best-known family of word alignment models, the IBM models, operates only on the surface forms of words and ignores their linguistic features. This limitation often causes data sparseness problems. We therefore present an extension that integrates morphological analysis into the traditional IBM models. Experiments on English-Vietnamese tasks show that the new model improves both word alignment quality and final translation performance.
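To make the sparseness issue concrete, the following is a minimal sketch of standard IBM Model 1 EM training on surface tokens, with an optional normalization hook on the conditioning side. It is not the extension proposed in the paper, whose formulation is not given in the abstract. The function name `train_ibm1`, the `normalize` parameter, and the toy lemma table are assumptions made purely for illustration.

```python
def train_ibm1(pairs, iterations=10, normalize=lambda w: w):
    """Estimate IBM Model 1 lexical probabilities t(f | e) with EM.

    pairs:     list of (e_tokens, f_tokens) sentence pairs.
    normalize: hypothetical hook applied to every e-side token, e.g. a
               lemmatizer; the identity keeps plain surface forms.
    Returns a dict mapping (f, e) to t(f | e) for co-occurring words.
    """
    # Prepend the NULL word and apply the (assumed) normalization hook.
    data = [(["NULL"] + [normalize(e) for e in es], fs) for es, fs in pairs]

    # Uniform initialisation over all co-occurring (f, e) pairs.
    f_vocab = {f for _, fs in data for f in fs}
    t = {}
    for es, fs in data:
        for f in fs:
            for e in es:
                t[(f, e)] = 1.0 / len(f_vocab)

    for _ in range(iterations):
        count = {}  # expected count of e generating f
        total = {}  # expected count of e generating anything
        # E-step: distribute each f over the e-words of its sentence.
        for es, fs in data:
            for f in fs:
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    c = t[(f, e)] / z
                    count[(f, e)] = count.get((f, e), 0.0) + c
                    total[e] = total.get(e, 0.0) + c
        # M-step: renormalise so that sum over f of t(f | e) equals 1.
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return t


# Toy English-Vietnamese data; the lemma table is a stand-in for a real
# morphological analyser and is an assumption of this sketch.
toy_pairs = [(["book"], ["sách"]), (["book", "house"], ["sách", "nhà"])]
surface_table = train_ibm1(toy_pairs)

toy_lemmas = {"books": "book", "houses": "house"}
lemma_table = train_ibm1(
    [(["books"], ["sách"]), (["book"], ["sách"])],
    normalize=lambda w: toy_lemmas.get(w, w),
)
```

On surface forms, "book" and "books" would each need their own parameters and their own training evidence. With the hook, both map to one conditioning word, so their expected counts are pooled, which is the kind of sparseness reduction that motivates bringing morphology into alignment.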
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Van Bui, V., Tran, T.T., Nguyen, N.B.T., Pham, T.D., Le, A.N., Le, C.A. (2015). Improving Word Alignment Through Morphological Analysis. In: Huynh, VN., Inuiguchi, M., Denoeux, T. (eds) Integrated Uncertainty in Knowledge Modelling and Decision Making. IUKM 2015. Lecture Notes in Computer Science, vol 9376. Springer, Cham. https://doi.org/10.1007/978-3-319-25135-6_30
DOI: https://doi.org/10.1007/978-3-319-25135-6_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25134-9
Online ISBN: 978-3-319-25135-6
eBook Packages: Computer Science, Computer Science (R0)