
Deep Neural Network-based Machine Translation System Combination

Published: 04 August 2020

Abstract

Deep neural networks (DNNs) have demonstrably advanced the state of the art in natural language processing (NLP) through their capacity for feature learning and representation. As one of the more challenging NLP tasks, machine translation has seen neural machine translation (NMT) emerge as a new approach that generates much more fluent output than statistical machine translation (SMT). However, SMT is usually better than NMT in translation adequacy and word coverage. Combining the advantages of NMT and SMT is therefore a promising direction. In this article, we propose a deep neural network-based system combination framework that leverages both minimum Bayes-risk decoding and multi-source NMT: it takes as input the N-best outputs of NMT and SMT systems and produces the final translation. In particular, we apply the proposed model to both RNN and self-attention networks with different segmentation granularities. We verify our approach empirically through a series of experiments on resource-rich Chinese⇒English and low-resource English⇒Vietnamese translation tasks. Experimental results demonstrate the effectiveness and universality of the proposed approach, which significantly outperforms conventional system combination methods and the best individual system output.
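To make the minimum Bayes-risk (MBR) idea mentioned in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes the N-best outputs of several systems are pooled into one candidate list and picks the candidate with the highest expected similarity to the others. The names `overlap_f1` and `mbr_select`, the uniform default weights, and the n-gram F1 similarity (a simple stand-in for a sentence-level metric such as BLEU) are all illustrative assumptions.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def overlap_f1(hyp, ref, max_n=2):
    """Average n-gram F1 between two token lists (assumed similarity measure)."""
    scores = []
    for n in range(1, max_n + 1):
        h, r = ngram_counts(hyp, n), ngram_counts(ref, n)
        match = sum((h & r).values())  # clipped n-gram matches
        if match == 0:
            scores.append(0.0)
            continue
        precision = match / sum(h.values())
        recall = match / sum(r.values())
        scores.append(2 * precision * recall / (precision + recall))
    return sum(scores) / len(scores)


def mbr_select(candidates, weights=None):
    """Return the candidate with the highest expected similarity to the rest.

    `weights` plays the role of (hypothetical) posterior probabilities of the
    candidates; uniform weights are assumed when none are given.
    """
    if not candidates:
        raise ValueError("candidates must not be empty")
    if weights is None:
        weights = [1.0 / len(candidates)] * len(candidates)
    tokenized = [c.split() for c in candidates]
    best_idx, best_gain = 0, float("-inf")
    for i, hyp in enumerate(tokenized):
        # Expected gain of choosing hypothesis i against all other candidates.
        gain = sum(
            w * overlap_f1(hyp, ref)
            for j, (ref, w) in enumerate(zip(tokenized, weights))
            if j != i
        )
        if gain > best_gain:
            best_idx, best_gain = i, gain
    return candidates[best_idx]


if __name__ == "__main__":
    pooled = [
        "the cat sat on the mat",  # e.g. from an NMT system
        "the cat sat on a mat",    # e.g. from an SMT system
        "a dog stood there",       # an outlier hypothesis
    ]
    print(mbr_select(pooled))
```

In this sketch the selected hypothesis is the one most in agreement with the pool. The framework described in the abstract goes further by feeding such outputs to a multi-source NMT model, which this example does not attempt to reproduce.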

    Published In

    ACM Transactions on Asian and Low-Resource Language Information Processing  Volume 19, Issue 5
    September 2020
    278 pages
    ISSN:2375-4699
    EISSN:2375-4702
    DOI:10.1145/3403646

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 04 August 2020
    Online AM: 07 May 2020
    Accepted: 01 March 2020
    Revised: 01 January 2020
    Received: 01 August 2019
    Published in TALLIP Volume 19, Issue 5

    Author Tags

    1. DNN
    2. NMT
    3. SMT
    4. low-resource translation
    5. minimal Bayes-risk decoding
    6. system combination

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • Beijing Municipal Science and Technology Project
    • Natural Science Foundation of China
