Abstract
Neural Machine Translation (NMT) has brought promising improvements in translation quality, but these models depend on large-scale parallel corpora. Since such corpora exist for only a handful of language pairs, translation performance falls far short of expectations for the majority of low-resource languages. Developing low-resource translation techniques is therefore crucial, and it has become a popular research area in neural machine translation. In this article, we present a comprehensive review of existing deep learning techniques for low-resource NMT. We first describe the current state of research and some widely used low-resource datasets. We then categorize the existing methods and discuss representative works in detail. Finally, we summarize the characteristics they share and outline future directions in this field.
Index Terms
- Low-resource Neural Machine Translation: Methods and Trends