Neighbors helping the poor: improving low-resource machine translation using related languages

Machine Translation

Abstract

Sentence-level parallel data is essential for training machine translation systems, yet for thousands of languages such data is extremely limited. To increase the parallel data available for a low-resource language, we borrow parallel data from a higher-resource, closely related language (RL). To do so, we propose a method for translating text from the RL into the low-resource language that requires no parallel data between the two. We use this method to convert RL/English parallel data into additional training data for machine translation, and we show that this extra parallel data substantially improves BLEU scores.
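The data-conversion step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the paper translates RL text with a model trained without parallel data, whereas the word-for-word lookup table here (`rl_to_lrl`) is a hypothetical stand-in, and the function names and placeholder tokens are assumptions made only for illustration.

```python
from typing import Dict, List, Tuple

# A sentence pair is (source-side sentence, English sentence).
SentencePair = Tuple[str, str]


def convert_sentence(rl_sentence: str, rl_to_lrl: Dict[str, str]) -> str:
    """Map each RL token to a low-resource-language token.

    Assumption: a plain word-for-word table stands in for the paper's
    RL-to-low-resource translation method. Unknown tokens pass through.
    """
    return " ".join(rl_to_lrl.get(tok, tok) for tok in rl_sentence.split())


def augment_parallel_data(
    lrl_english: List[SentencePair],
    rl_english: List[SentencePair],
    rl_to_lrl: Dict[str, str],
) -> List[SentencePair]:
    """Convert the RL side of RL/English data and append it to the
    original low-resource/English data; the English side is unchanged."""
    converted = [(convert_sentence(src, rl_to_lrl), eng) for src, eng in rl_english]
    return lrl_english + converted


# Placeholder tokens only; these are not real words from any language.
table = {"rl_the": "lrl_the", "rl_house": "lrl_house"}
original = [("lrl_a lrl_dog", "a dog")]
borrowed = [("rl_the rl_house", "the house")]
combined = augment_parallel_data(original, borrowed, table)
```

Simple concatenation is only one way to use the converted data. The paper also discusses methods for combining converted and original parallel data, which this sketch does not reproduce.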


Notes

  1. Tuning with a held-out subset of training data results in lower BLEU scores in all the experiments but does not change the conclusions of this paper.

References

  • Chalamandaris A, Protopapas A, Tsiakoulis P, Raptis S (2006) All Greek to me! an automatic Greeklish to Greek transliteration system. In: Proceedings of the 5th international conference on language resources and evaluation (LREC’06), Genoa, Italy, pp 1226–1229

  • Chen Y, Liu Y, Cheng Y, Li V (2017) A teacher-student framework for zero-resource neural machine translation. In: Proceedings of the 55th annual meeting of the association for computational linguistics, vol 1, Long Papers. Vancouver, Canada, pp 1925–1935

  • Cicekli I (2002) A machine translation system between a pair of closely related languages. In: Proceedings of the 17th international symposium on computer and information sciences (ISCIS 2002), CRC Press, Orlando, Florida, pp 192–196

  • Conneau A, Lample G, Ranzato M, Denoyer L, Jégou H (2017) Word translation without parallel data. arXiv:1710.04087

  • Currey A, Karakanta A, Dehdari J (2016) Using related languages to enhance statistical language models. In: Proceedings of the NAACL student research workshop, San Diego, California, pp 116–123

  • Dou Q, Knight K (2012) Large scale decipherment for out-of-domain machine translation. In: Proceedings of the 2012 joint conference on empirical methods in natural language processing and computational natural language learning, Jeju Island, Korea, pp 266–275

  • Firat O, Sankaran B, Al-Onaizan Y, Yarman Vural FT, Cho K (2016) Zero-resource translation with multi-lingual neural machine translation. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, Texas, pp 268–277

  • Forcada ML, Ginestí-Rosell M, Nordfalk J, O’Regan J, Ortiz-Rojas S, Pérez-Ortiz JA, Sánchez-Martínez F, Ramírez-Sánchez G, Tyers FM (2011) Apertium: a free/open-source platform for rule-based machine translation. Mach Transl 25(2):127–144

  • Fung P, Yee LY (1998) An IR approach for translating new words from nonparallel, comparable texts. In: Proceedings of the 36th annual meeting of the association for computational linguistics and 17th international conference on computational linguistics, Vol 1. Montreal, Quebec, Canada, pp 414–420

  • Goldhahn D, Eckart T, Quasthoff U (2012) Building large monolingual dictionaries at the Leipzig corpora collection: From 100 to 200 languages. In: Proceedings of the 8th international conference on language resources and evaluation (LREC-2012), Istanbul, Turkey, pp 759–765

  • Haghighi A, Liang P, Berg-Kirkpatrick T, Klein D (2008) Learning bilingual lexicons from monolingual corpora. In: Proceedings of ACL-08: HLT, Columbus, Ohio, pp 771–779

  • Hajič J, Hric J, Kuboň V (2000) Machine translation of very close languages. In: Proceedings of the 6th conference on applied natural language processing, Seattle, Washington, USA, pp 7–12

  • Hana J, Feldman A, Brew C, Amaral L (2006) Tagging Portuguese with a Spanish tagger using cognates. In: Proceedings of the international workshop on cross-language knowledge induction, Sydney, Australia, pp 33–40

  • Hitham AB, Shaalan K, Ziedan I (2008) A hybrid approach for converting written Egyptian colloquial dialect into diacritized Arabic. In: The 6th international conference on informatics and systems, Cairo, Egypt, pp 27–33

  • Irvine A (2013) Statistical machine translation in low resource settings. In: Proceedings of the 2013 NAACL HLT student research workshop, Atlanta, Georgia, pp 54–61

  • Irvine A, Callison-Burch C (2013) Combining bilingual and comparable corpora for low resource machine translation. In: Proceedings of the 8th workshop on statistical machine translation, Sofia, Bulgaria, pp 262–270

  • Johnson M, Schuster M, Le QV, Krikun M, Wu Y, Chen Z, Thorat N, Viégas F, Wattenberg M, Corrado G et al (2017) Google’s multilingual neural machine translation system: enabling zero-shot translation. Trans Assoc Comput Linguist 5:339–351

  • Karakanta A, Dehdari J, van Genabith J (2018) Neural machine translation for low-resource languages without parallel corpora. Mach Transl 32(1):1–23

  • Knight K, Nair A, Rathod N, Yamada K (2006) Unsupervised analysis for decipherment problems. In: Proceedings of the COLING/ACL 2006 main conference poster sessions, Sydney, Australia, pp 499–506

  • Koehn P, Knight K (2002) Learning a translation lexicon from monolingual corpora. In: Proceedings of the ACL-02 workshop on unsupervised lexical acquisition, Philadelphia, Pennsylvania, USA, pp 9–16

  • Koehn P, Hoang H, Birch A, Callison-Burch C, Federico M, Bertoldi N, Cowan B, Shen W, Moran C, Zens R, Dyer C, Bojar O, Constantin A, Herbst E (2007) Moses: Open source toolkit for statistical machine translation. In: Proceedings of the 45th annual meeting of the association for computational linguistics companion volume proceedings of the demo and poster sessions, Prague, Czech Republic, pp 177–180

  • Kondrak G, Marcu D, Knight K (2003) Cognates can improve statistical translation models. In: Companion volume of the proceedings of HLT-NAACL 2003—short papers, Edmonton, Canada, pp 46–48

  • Lample G, Denoyer L, Ranzato M (2017) Unsupervised machine translation using monolingual corpora only. arXiv:1711.00043

  • Larasati SD, Kuboň V (2010) A study of Indonesian-to-Malaysian MT system. In: Proceedings of the 4th international MALINDO workshop, Depok, Indonesia, pp 16–22

  • Liu CH, Silva CC, Wang L, Way A (2018) Pivot machine translation using Chinese as pivot language. In: CWMT 2018: Proceedings of the 14th China workshop on machine translation, Wuyishan, China, pp 1–12

  • Mann GS, Yarowsky D (2001) Multipath translation lexicon induction via bridge languages. In: Second Meeting of the North American Chapter of the Association for Computational Linguistics, Pittsburgh, Pennsylvania, USA, 8 pp

  • May J, Benjira Y, Echihabi A (2014) An Arabizi-English social media statistical machine translation system. In: Proceedings of the 11th conference of the association for machine translation in the Americas, Vancouver, British Columbia, Canada, pp 329–341

  • Naim I, Riley P, Gildea D (2018) Feature-based decipherment for machine translation. Comput Linguist 44(3):525–546

  • Nakov P, Ng HT (2009) Improved statistical machine translation for resource-poor languages using related resource-rich languages. In: Proceedings of the 2009 conference on empirical methods in natural language processing, Singapore, pp 1358–1367

  • Nakov P, Tiedemann J (2012) Combining word-level and character-level models for machine translation between closely-related languages. In: Proceedings of the 50th annual meeting of the association for computational linguistics, Vol 2, Short Papers. Jeju Island, Korea, pp 301–305

  • Nuhn M, Mauser A, Ney H (2012) Deciphering foreign language by combining language models and context vectors. In: Proceedings of the 50th annual meeting of the association for computational linguistics, vol 1, Long Papers. Jeju Island, Korea, pp 156–164

  • Och FJ (2003) Minimum error rate training in statistical machine translation. In: Proceedings of the 41st annual meeting of the association for computational linguistics, Sapporo, Japan, pp 160–167

  • Och FJ, Ney H (2003) A systematic comparison of various statistical alignment models. Comput Linguist 29(1):19–51

  • Papineni K, Roukos S, Ward T, Zhu WJ (2002) BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the ACL-2002 40th annual meeting of the association for computational linguistics, Philadelphia, pp 311–318

  • Passban P, Liu Q, Way A (2017) Translating low-resource languages by vocabulary adaptation from close counterparts. ACM Trans Asian Low-Resour Lang Inf Process 16(4):29. https://doi.org/10.1145/3099556

  • Pourdamghani N, Knight K (2017) Deciphering related languages. In: Proceedings of the 2017 conference on empirical methods in natural language processing, Copenhagen, Denmark, pp 2513–2518

  • Rapp R (1995) Identifying word translations in non-parallel texts. In: Proceedings of the 33rd annual meeting of the association for computational linguistics, Cambridge, Massachusetts, USA, pp 320–322

  • Ravi S (2013) Scalable decipherment for machine translation via hash sampling. In: Proceedings of the 51st annual meeting of the association for computational linguistics, vol 1, Long Papers. Sofia, Bulgaria, pp 362–371

  • Ravi S, Knight K (2009) Learning phoneme mappings for transliteration without parallel data. In: Proceedings of human language technologies: The 2009 annual conference of the North American chapter of the association for computational linguistics, Boulder, Colorado, pp 37–45

  • Ravi S, Knight K (2011a) Bayesian inference for Zodiac and other homophonic ciphers. In: Proceedings of the 49th annual meeting of the association for computational linguistics: human language technologies, Portland, Oregon, USA, pp 239–247

  • Ravi S, Knight K (2011b) Deciphering foreign language. In: Proceedings of the 49th annual meeting of the association for computational linguistics: human language technologies, Portland, Oregon, USA, pp 12–21

  • Salloum W, Habash N (2011) Dialectal to standard Arabic paraphrasing to improve Arabic–English statistical machine translation. In: Proceedings of the 1st workshop on algorithms and resources for modelling of dialects and language varieties, Edinburgh, Scotland, pp 10–21

  • Sawaf H (2010) Arabic dialect handling in hybrid machine translation. In: Proceedings of the 2010 AMTA, 9th conference of the association for machine translation in the Americas. Denver, Colorado, p 8

  • Scannell KP (2006) Machine translation for closely related language pairs. In: Proceedings of the LREC workshop on strategies for developing machine translation for minority languages, Genoa, Italy, pp 103–109

  • Smith JR, Quirk C, Toutanova K (2010) Extracting parallel sentences from comparable corpora using document level alignment. In: Proceedings of the Human Language Technologies: The 2010 annual conference of the North American chapter of the association for computational linguistics. Los Angeles, California, pp 403–411

  • Tiedemann J (2009) Character-based PSMT for closely related languages. In: Proceedings of the 13th conference of the European association for machine translation, Barcelona, Spain, pp 12–19

  • Tiedemann J (2012) Parallel data, tools and interfaces in OPUS. In: Proceedings of the 8th international conference on language resources and evaluation (LREC-2012), Istanbul, Turkey, pp 2214–2218

  • Utiyama M, Isahara H (2007) A comparison of pivot methods for phrase-based statistical machine translation. In: Proceedings of the Main Conference, Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics, Rochester, New York, pp 484–491

  • Vilar D, Peter JT, Ney H (2007) Can we translate letters? In: Proceedings of the 2nd workshop on statistical machine translation, Prague, Czech Republic, pp 33–39

  • Wu H, Wang H (2007) Pivot language approach for phrase-based statistical machine translation. In: Proceedings of the 45th annual meeting of the association of computational linguistics, Prague, Czech Republic, pp 856–863

Acknowledgements

This work was supported by DARPA Contract HR0011-15-C-0115. The authors would like to thank Marjan Ghazvininejad, Ulf Hermjakob, Jonathan May, and Michael Pust for their comments and suggestions.

Author information

Corresponding author

Correspondence to Nima Pourdamghani.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work is a significant extension of Pourdamghani and Knight (2017). The cipher model (Sects. 4.1, 5.1, and 6) and the evaluation of RL-to-IL translation accuracy (Sect. 8.1) were initially presented in Pourdamghani and Knight (2017). The other sections, including the description of the language models and their training, the idea of converting the parallel data, the methods for combining converted and original parallel data, and the machine translation experiments, are presented in this paper for the first time.

About this article

Cite this article

Pourdamghani, N., Knight, K. Neighbors helping the poor: improving low-resource machine translation using related languages. Machine Translation 33, 239–258 (2019). https://doi.org/10.1007/s10590-019-09236-7

  • DOI: https://doi.org/10.1007/s10590-019-09236-7
