Abstract
Uyghur-Chinese machine translation system combination has two drawbacks: it does not consider semantic information during combination, and the individual systems that participate in the combination lack diversity. This paper addresses both problems with a system combination method that generates multiple new systems from a single Statistical Machine Translation (SMT) engine and combines them. The new systems are generated based on a bilingual phrase semantic representation model. Specifically, a bilinear semantic similarity score and a cosine semantic similarity score are first computed for Uyghur-Chinese bilingual phrases by the bilingual phrase semantic representation model; several new systems are then generated by adding these scores, as static features and as dynamic features, to the original feature set of the phrase-based translation model. Finally, the newly generated systems are combined with the baseline system to obtain the final combination result. Experimental results on the Uyghur-Chinese CWMT2013 test sets show that our approach significantly outperforms the baseline, by 0.63 BLEU points.
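To make the two similarity scores concrete, the following is a minimal sketch of how a cosine score and a bilinear score could be computed for one phrase pair and appended to an existing feature list. It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy embeddings, and the identity weight matrix `W` are all hypothetical, and the abstract does not specify how the embeddings or the bilinear weights are obtained, nor how static and dynamic features differ in practice.

```python
import math


def cosine_similarity(src_vec, tgt_vec):
    """Cosine of the angle between a source and a target phrase embedding."""
    dot = sum(s * t for s, t in zip(src_vec, tgt_vec))
    norm_src = math.sqrt(sum(s * s for s in src_vec))
    norm_tgt = math.sqrt(sum(t * t for t in tgt_vec))
    if norm_src == 0.0 or norm_tgt == 0.0:
        return 0.0  # define the score as 0 for an all-zero embedding
    return dot / (norm_src * norm_tgt)


def bilinear_similarity(src_vec, weight, tgt_vec):
    """Bilinear score src^T * W * tgt, with W given as a list of rows."""
    return sum(
        s * sum(w * t for w, t in zip(row, tgt_vec))
        for s, row in zip(src_vec, weight)
    )


def extend_features(base_features, src_vec, tgt_vec, weight):
    """Return the original phrase-pair features plus the two semantic scores."""
    return list(base_features) + [
        bilinear_similarity(src_vec, weight, tgt_vec),
        cosine_similarity(src_vec, tgt_vec),
    ]


# Toy values (assumptions for illustration only).
src = [1.0, 0.0]                 # hypothetical Uyghur phrase embedding
tgt = [1.0, 1.0]                 # hypothetical Chinese phrase embedding
W = [[1.0, 0.0], [0.0, 1.0]]     # identity stands in for learned weights
features = extend_features([0.5, 0.2], src, tgt, W)
print(features)
```

Each distinct way of extending the feature set (for example, adding only one score, or both) would yield a differently tuned system from the same engine, which is the source of diversity the abstract describes.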
Acknowledgements
This research is supported by the Xinjiang Uygur Autonomous Region Level talent introduction project (Y839031201), the National Natural Science Foundation of China (U1703133), the Subsidy of the Youth Innovation Promotion Association of the Chinese Academy of Sciences (2017472), and the Xinjiang Key Laboratory Fund under Grant 2018D04018.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, Y., Li, X., Yang, Y., Anwar, A., Dong, R. (2019). Research of Uyghur-Chinese Machine Translation System Combination Based on Semantic Information. In: Tang, J., Kan, MY., Zhao, D., Li, S., Zan, H. (eds) Natural Language Processing and Chinese Computing. NLPCC 2019. Lecture Notes in Computer Science, vol 11839. Springer, Cham. https://doi.org/10.1007/978-3-030-32236-6_45
DOI: https://doi.org/10.1007/978-3-030-32236-6_45
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-32235-9
Online ISBN: 978-3-030-32236-6
eBook Packages: Computer Science (R0)