A review of Thai–English machine translation

  • Published:
Machine Translation

Abstract

The improvement of machine translation (MT) for languages such as Thai requires access to knowledge reported in past and current research. Given the distinctive features that Thai shares with several Asian languages, and the recent shift in MT towards a neural network-based approach, researchers need an understanding of these languages to aid further research. The purpose of this study is to provide an overview of the significant research in Thai–English MT that is both a valuable reference for current researchers in the field and suitable for the non-expert. We include details of the relevant language characteristics and extensive coverage of the important contributions to MT in Thailand. Although the application of neural networks to translation (called ‘neural machine translation’ (NMT)) is rapidly evolving and not widely reported in academic work involving Thai translation, it has shown potential for languages that require segmentation or have few resources and is therefore of special interest. Translation techniques are often not directly applicable to many Asian languages because of their linguistic features and versatile writing systems, but NMT is already in widespread use in industry for the translation of Asian languages including Thai. Given this success and potential for Thai translation, the aim is to integrate this important area into current literature on Thai–English MT, and to encourage interest and support advancement.

Figs. 1–6

Adapted from Haruechaiyasak et al. (2008)

Notes

  1. Overview. History of the Thai language. https://www.thai-language.com/ref/Overview Accessed 14 June 2018.

  2. Inside a Thai Syllable: Part I. https://www.thai-language.com/id/830221 Accessed 7 July 2020.

  3. As Way and Hearne (2011, p.234) point out, “In seeking to understand SMT in particular, this is a key distinction; while the means by which RBMT and EBMT generate translations usually look somewhat plausible to linguists and translators, the methods of translation generation in SMT are not intuitively plausible. In fact, the methods used—at least until the recent attempts to incorporate syntactic knowledge into PB-SMT—are not intended to be either linguistically or cognitively plausible (just probabilistically plausible), and holding onto the notion that they somehow are or should be simply hinders understanding of SMT” (original emphasis).

  4. https://www.nist.gov/itl/iad/mig/openmt15-evaluation Accessed 14 June 2018.

  5. https://www.youtube.com/watch?v=nR74lBO5M3s&t Accessed 7 June 2018.

  6. https://www.cs.cmu.edu/~paisarn/software.html.

  7. https://sites.google.com/site/iwsltevaluation2015/mt-track Accessed 11 June 2018.

  8. https://sealang.net/thai/bitext.htm.

  9. Network-Based ASEAN Languages. https://www.aseanmt.org/index.php?q=index/status_update Accessed 25 June 2018.

  10. https://sites.google.com/site/iwsltevaluation2015/mt-track.

  11. https://cloud.google.com/translate/docs/languages#languages-nmt Accessed 7 June 2018.

  12. Microsoft Translator now offers more accurate and human-like translations of Thai. https://www.nationthailand.com/Startup_and_IT/30343651 Published on April 21, 2018, written by ThaiVisa. Accessed 6 June 2018.

  13. Baidu Thailand reveals marketing platform. https://www.bangkokpost.com/tech/local-news/1208193/baidu-thailand-reveals-marketing-platfor. Published on 3/3/2017, written by Srisamorn Phoosuphanusorn. Accessed 7 June 2018.

  14. South Korea's Naver launches official version of AI translator ‘Papago’. https://www.nationthailand.com/Startup_and_IT/30321483. Published on July 22, 2017, written by The Korea Herald/ANN. Accessed 9 December 2018.

  15. Explorations on Multi-lingual Neural Machine Translation. https://www.youtube.com/watch?v=dOewfz19dMA Accessed 7 June 2018.

  16. https://wordseg.readthedocs.io/en/latest/.

  17. https://pioneer.chula.ac.th/~awirote/resources/thai-word-segmentation.html Accessed 5 January 2019.

  18. cf. https://moin.delph-in.net/GlennSlayden Accessed 15 July 2020.

  19. Deepcut. A Thai Word Tokenization Library using Deep Neural Network. https://github.com/rkcosmos/deepcut Accessed 11 June 2018.

  20. Thai Word Segmentation with Bi-Directional RNN. https://github.com/sertiscorp/thai-word-segmentation Accessed 13 June 2018.

  21. Thai Word Segmentation with Bi-Directional RNN. https://www.sertiscorp.com/november-20-2017 Accessed 13 June 2018.

  22. Thai Natural Language Processing (Thai NLP) Resource. https://github.com/kobkrit/nlp_thai_resources Accessed 7 July 2020.

  23. CutKum. Thai Word-Segmentation with Deep Learning in Tensorflow. https://github.com/pucktada/cutkum Accessed 11 June 2018.

  24. https://github.com/KenjiroAI/SynThai Accessed 11 June 2018.

  25. https://www.ted.com/.

  26. https://opus.nlpl.eu/OpenSubtitles-v2018.php.

  27. https://www.opensubtitles.org/en/search/subs.

References

  • Aroonmanakun W (2002) Collocation and Thai word segmentation. In: Proceedings of SNLP-oriental COCOSDA, Hua Hin, Thailand, pp 68–75

  • Aroonmanakun W (2007) Thoughts on word and sentence segmentation in Thai. In: Proceedings of the seventh symposium on natural language processing, Pattaya, Thailand, pp 85–90

  • Bahdanau D, Cho K, Bengio Y (2014) Neural machine translation by jointly learning to align and translate. arXiv:1409.0473

  • Bheganan P, Nayak R, Xu Y (2009) Thai word segmentation with hidden Markov model and decision tree. In: Proceedings of Pacific-Asia Conference on Knowledge Discovery and Data Mining, Bangkok, Thailand. Springer-Verlag, Berlin, Heidelberg, pp 74–85

  • Boonkwan P, Kawtrakul A (2002) Plaesarn: machine-aided translation tool for English-to-Thai. In: Proceedings of the 2002 COLING workshop on machine translation in Asia, Taipei, Taiwan, 7pp

  • Boonkwan P, Supnithi T (2017) Bidirectional deep learning of context representation for joint word segmentation and POS tagging. In: Proceedings of international conference on computer science, applied mathematics and applications. Springer, Cham, Switzerland, pp 184–196

  • Brown P, Della Pietra SA, Della Pietra VJ, Mercer RL (1993) The mathematics of statistical machine translation: parameter estimation. Comput Ling 19(2):263–311

  • Carl M, Way A (eds) (2003) Recent advances in example-based machine translation. Kluwer Academic Publishers, Dordrecht, The Netherlands

  • Castano A, Casacuberta F (1997) A connectionist approach to machine translation. In: Proceedings of EUROSPEECH-1997: Fifth European conference on speech communication and technology, Rhodes, Greece, pp 91–94

  • Cettolo M, Girardi C, Federico M (2012) WIT3: web inventory of transcribed and translated talks. In: Proceedings of the 16th conference of European Association for Machine Translation (EAMT), Trento, Italy, pp 261–268

  • Chancharoen K, Tannin N, Sirinaovakul B (1999) Pattern-based machine translation for English-Thai. In: Proceedings of the 13th Pacific Asia conference on language, information and computation. Taiwan, China, pp 329–336

  • Charoenpornsawat P, Schultz T (2008) Improving word segmentation for Thai speech translation. In: Proceedings of IEEE spoken language technology workshop, Goa, India, pp 241–244

  • Charoenpornsawat P, Sornlertlamvanich V (2001) Automatic sentence break disambiguation for Thai. In: Proceedings of international conference on computer processing of oriental languages (ICCPOL), Seoul, South Korea, pp 231–235

  • Charoenpornsawat P, Sornlertlamvanich V, Charoenporn T (2002) Improving translation quality of rule-based machine translation. In: Proceedings of the 2002 COLING workshop on machine translation in Asia, Taipei, Taiwan, 6pp

  • Chen MX, Firat O, Bapna A, Johnson M, Macherey W, Foster G, Jones L, Parmar N, Schuster M, Chen Z, Wu Y, Hughes M (2018) The best of both worlds: combining recent advances in neural machine translation. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Melbourne, Australia, pp 76–86

  • Cheng Y, Shen S, He Z, He W, Wu H, Sun M, Liu Y (2015) Agreement-based joint training for bidirectional attention-based neural machine translation. arXiv:1512.04650

  • Chiang D (2005) A hierarchical phrase-based model for statistical machine translation. In: Proceedings of ACL-2005: 43rd Annual Meeting of the Association for Computational Linguistics, Ann Arbor, Michigan, pp 263–270

  • Chimsuk T, Auwatanamongkol S (2009) A Thai to English machine translation system using Thai LFG tree structure as Interlingua. World Acad Sci Eng Technol Int J Math Comput Phys Electr Comput Eng 3:1134–1139

  • Cho K, Van Merriënboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y (2014) Learning phrase representations using RNN encoder-decoder for statistical machine translation. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), Doha, Qatar, pp 1724–1734

  • Chung J, Cho K, Bengio Y (2016) A character-level decoder without explicit segmentation for neural machine translation. In: Proceedings of the 54th annual meeting of the association for computational linguistics (volume 1: long papers) Association for Computational Linguistics, Berlin, Germany, pp 1692–1703

  • Coughlin R, Setthawong R, Setthawong P (2018) An improved English-Thai translation framework for non-timing aligned parallel corpora using bleualign with explicit feedback. In: Proceedings of the 10th international conference on advances in information technology, Bangkok, Thailand, 8pp

  • Denkowski M, Lavie A (2010) Choosing the right evaluation for machine translation: an examination of annotator and automatic metric performance on human judgment tasks. In: Proceedings of AMTA 2010: the Ninth conference of the Association for Machine Translation in the Americas, Denver, Colorado, 9pp

  • Dong D, Wu H, He W, Yu D, Wang H (2015) Multi-task learning for multiple language translation. In: Proceedings of the 53rd Annual meeting of the association for computational linguistics and the 7th international joint conference on natural language processing (Volume 1: Long Papers), Beijing, China, pp 1723–1732

  • Finch A, Liu L, Wang X, Sumita E (2016) Target-bidirectional neural models for machine transliteration. In: Proceedings of the sixth named entity workshop, Berlin, Germany, pp 78–82

  • Firat O, Cho K, Bengio Y (2016) Multi-way, multilingual neural machine translation with a shared attention mechanism. In: Proceedings of the 2016 conference of the North American chapter of the association for computational linguistics: human language technologies. San Diego, California, pp 866–875

  • Forcada ML, Ñeco RP (1997) Recursive hetero-associative memories for translation. In: Biological and artificial computation: from neuroscience to technology (international work-conference on artificial neural networks (IWANN'97), proceedings), Lanzarote, Canary Islands, Spain. Springer, Berlin, Heidelberg, pp 453–462

  • Goodman K, Nirenburg S (eds) (1991) The KBMT project: a case study in knowledge-based machine translation. Morgan Kaufmann, Burlington, Massachusetts

  • Haruechaiyasak C, Kongyoung S (2009) TLex: Thai lexeme analyser based on the conditional random fields. In: InterBEST 2009 Thai word segmentation workshop, proceedings of 8th international symposium on natural language processing, Bangkok, Thailand, pp 13–17

  • Haruechaiyasak C, Kongyoung S, Dailey M (2008) A comparative study on Thai word segmentation approaches. In: Proceedings of 5th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology, Krabi, Thailand, pp 125–128

  • Haruechaiyasak C, Sangkeettrakarn C, Palingoon P, Kongyoung S, Damrongrat C (2006) A collaborative framework for collecting Thai unknown words from the web. In: Proceedings of the COLING/ACL Main Conference Poster Sessions, Sydney, Australia, pp 345–352

  • Hutchins WJ (1995) Machine Translation: A Brief History. In: Koerner EFK, Asher RE (eds) Concise history of the language sciences: from the Sumerians to the cognitivists. Pergamon Press, Oxford, UK, pp 431–445

  • Isozaki H, Hirao T, Duh K, Sudoh K, Tsukada H (2010) Automatic evaluation of translation quality for distant language pairs. In: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, Cambridge, MA, pp 944–952

  • Jean S, Cho K, Memisevic R, Bengio Y (2015) On using very large target vocabulary for neural machine translation. In: Proceedings of the 53rd annual meeting of the association for computational linguistics and the 7th international joint conference on natural language processing (Volume 1: Long Papers), Beijing, China, pp 1–10

  • Johnson M, Schuster M, Le QV, Krikun M, Wu Y, Chen Z, Thorat N, Viégas F, Wattenberg M, Corrado G, Hughes M (2017) Google's multilingual neural machine translation system: enabling zero-shot translation. Trans Assoc Comput Ling 5:339–351

  • Kalchbrenner N, Blunsom P (2013) Recurrent continuous translation models. In: Proceedings of the 2013 conference on empirical methods in natural language processing, USA, pp 1700–1709

  • Kampanya N, Boonkwan P, Kawtrakul A (2002) Bilingual unknown word alignment tool for English-Thai. In: Proceedings of the joint international conference of SNLP-Oriental COCOSDA 2002: the fifth symposium on natural language processing & the fifth oriental COCOSDA Workshop, Hua Hin, Prachuapkirikhan, Thailand, 7pp

  • Kaplan RM, Netter K, Wedekind J, Zaenen A (1989) Translation by structural correspondences. In: Proceedings of the fourth conference of the European chapter of the association for computational linguistics, Manchester, England, pp 272–281

  • Kawtrakul A, Boonkwan P (2004) An integrated tool for translation-memory maintenance. https://pdfs.semanticscholar.org/7420/f88b7b94cffc504c9fd3faad5f43836e8fa8.pdf?_ga=2.142574834.1106393232.1594816470-1712962596.1567069468

  • Kawtrakul A, Kumtanode S, Jamjunya T, Jewriyavech A (1995) A Lexibase model for writing production assistant system. In: Proceedings of the second symposium on natural language processing, Bangkok, Thailand, pp 226–236

  • Kawtrakul A, Praneetpolgrang P (2014) A history of AI research and development in Thailand: three periods, three directions. AI Mag 35(2):83–92

  • Kawtrakul A, Suktarachan M, Varasai P, Chanlekha H (2002) A state of the art of Thai language resources and Thai language behavior analysis and modeling. In: Proceedings of COLING-02: The 3rd workshop on Asian language resources and international standardization, Taipei, Taiwan, 8pp

  • Kazimi MB (2017) Coverage model for character-based neural machine translation. Master's thesis, Universitat Politècnica de Catalunya, Barcelona, Spain

  • Khankasikam K, Muansuwan N (2005) Thai word segmentation: a lexical semantic approach. In: Proceedings of the Tenth Machine Translation Summit, Phuket, Thailand, pp 331–338

  • Kim Y, Rush AM (2016) Sequence-level knowledge distillation. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, Texas, pp 1317–1327

  • Kit C, Wong TM (2008) Comparative evaluation of online machine translation systems with legal texts. Law Libr J 100:299–321

  • Klaithin S, Kriengket K, Phaholphinyo S, Kosawat K (2011) Thai word segmentation verification tool. In: Proceedings of the 2nd Workshop on South Southeast Asian Natural Language Processing (WSSANLP), Chiang Mai, Thailand, pp 16–22

  • Klein G, Kim Y, Deng Y, Senellart J, Rush AM (2017) OpenNMT: open-source toolkit for neural machine translation. arXiv:1701.02810

  • Koehn P (2004) Pharaoh: a beam search decoder for phrase-based statistical machine translation models. In: Machine translation: from real users to research: 6th conference of the Association for Machine Translation in the Americas, AMTA 2004, Proceedings, Washington, DC, Springer Verlag, Berlin, pp 115–124

  • Koehn P (2009) Statistical machine translation. Cambridge University Press, Cambridge, UK

  • Koehn P, Hoang H, Birch A, Callison-Burch C, Federico M, Bertoldi N, Cowan B, Shen W, Moran C, Zens R, Dyer C, Bojar O, Constantin A, Herbst E (2007) Moses: open source toolkit for statistical machine translation. In: Proceedings of the 45th annual meeting of the ACL on interactive poster and demonstration sessions, Prague, Czech Republic, pp 177–180

  • Kongyoung S, Rugchatjaroen A, Kosawat K (2015) TLex+: a hybrid method using conditional random fields and dictionaries for Thai word segmentation. In: International conference on knowledge, information, and creativity support systems, Phuket. Springer, Cham, pp 112–125

  • Kosawat K, Boriboon M, Chootrakool P, Chotimongkol A, Klaithin S, Kongyoung S, Kriengket K, Phaholphinyo S, Purodakananda S, Thanakulwarapas T, Wutiwiwatchai C (2009) BEST 2009: Thai word Segmentation software contest. In: Proceedings of the Eighth International Symposium on Natural Language Processing, Bangkok, Thailand, pp 83–88

  • Kritsuthikul N, Thammano A, Supnithi T (2006) English–Thai example-based machine translation using n-gram model. In: Proceedings of the 2006 IEEE International Conference on Systems, Man and Cybernetics, Taipei, Taiwan, pp 4386–4390

  • Kruengkrai C, Sornlertlamvanich V, Isahara H (2006) A conditional random field framework for Thai morphological analysis. In: Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06), Genoa, Italy, pp 2419–2424

  • Kruengkrai C, Uchimoto K, Kazama J, Torisawa K, Isahara H, Jaruskulchai C (2009) A word and Character-cluster hybrid model for Thai Word segmentation. In: Proceedings of the Eighth International Symposium on Natural Language Processing, Bangkok, Thailand, pp 24–29

  • Labutsri N, Chamchong R, Booth R, Rodtook A (2009) English syntactic reordering for English-Thai phrase-based statistical machine translation. In: Proceedings of the 6th International Joint Conference on Computer Science and Software Engineering (JCSSE 2009), Phuket, Thailand, pp 360–366

  • Lee HG, Lee J, Kim JS, Lee CK (2015) NAVER machine translation system for WAT 2015. In: Proceedings of the 2nd workshop on asian translation (WAT2015), Kyoto, Japan, pp 69–73

  • Lee J, Cho K, Hofmann T (2017) Fully character-level neural machine translation without explicit segmentation. Trans Assoc Comput Ling 5:365–378

  • Lertpiya A, Chaiwachirasak T, Maharattanamalai N, Lapjaturapit T, Chalothorn T, Tirasaroj N, Chuangsuwanich E (2018) A preliminary study on fundamental Thai NLP tasks for user-generated web content. In: Proceedings of the 2018 International Joint Symposium on Artificial Intelligence and Natural Language Processing (iSAI-NLP), Pattaya, Thailand, 8pp

  • Limcharoen P, Nattee C, Theeramunkong T (2009) Thai word Segmentation based-on GLR parsing technique and word n-gram model. In: Proceedings of the Eighth International Symposium on Natural Language Processing, Bangkok, Thailand, 7pp

  • Lommel A (2018) Metrics for translation quality assessment: a case for standardising error typologies. In: Moorkens J, Castilho S, Gaspari F, Doherty S (eds) Translation quality assessment. Springer, Cham, Switzerland

  • Luekhong P, Limkonchotiwat P, Ruangrajitpakorn T (2019) A study on an effect of using deep learning in Thai-English machine translation processes. In: Proceedings of the 2019 14th International Joint Symposium on Artificial Intelligence and Natural Language Processing (iSAI-NLP), Chiang Mai, Thailand, 6pp

  • Luekhong P, Ruangrajitpakorn T, Sukhahuta R, Supnithi T (2016) A study of a Thai-English translation comparing on applying phrase-based and hierarchical phrase-based translation. In: Proceedings of the International Symposium on Natural Language Processing, Advances in Natural Language Processing, Intelligent Informatics and Smart Technology, Springer International Publishing, Cham, Switzerland, pp 38–48

  • Luekhong P, Ruangrajitpakorn T, Sukhahuta R, Supnithi T (2017) A framework of 2-step bilingual alignment for SMT: in Case Study of Thai-English Translation. Chiang Mai University Library Journal Article, Chiang Mai, Thailand: https://cmuir.cmu.ac.th/jspui/handle/6653943832/57064

  • Luekhong P, Ruangrajitpakorn T, Supnithi T, Sukhahuta R (2013) Pooja: similarity-based bilingual word alignment framework for SMT. In: Proceedings of the 10th International Symposium on Natural Language Processing, Phuket, Thailand, pp 199–204

  • Luekhong P, Sukhahuta R, Porkaew P, Ruangrajitpakorn T, Supnithi T (2012) A comparative study on applying hierarchical phrase-based and phrase-based on Thai-Chinese Translation. In: Proceedings of the 2012 Seventh International Conference on Knowledge, Information and Creativity Support Systems, Melbourne, Australia, pp 126–133

  • Luong MT, Pham H, Manning CD (2015a) Effective approaches to attention-based neural machine translation. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal, pp 1412–1421

  • Luong MT, Sutskever I, Le QV, Vinyals O, Zaremba W (2015b) Addressing the rare word problem in neural machine translation. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Beijing, China, pp 11–19

  • Lyons S (2016a) A survey of the use of mobile technology and translation tools by students at secondary school in Thailand. Payap University Journal 26(1):35–57

  • Lyons S (2016b) Quality of Thai to English machine translation. In: Pacific Rim Knowledge Acquisition Workshop, Phuket, Thailand, pp 261–270. Springer International Publishing, Cham, Switzerland

  • Lyons S (2020) Comparison of neural network and traditional Thai word segmentation systems. In: Proceedings of the Payap University Research Symposium 2020, Chiang Mai, Thailand, pp 593–604

  • Mahatthanachai C, Malaivongs K, Tantranont N (2016) Thai word segmentation technique for solving unknown words and ambiguous words using rules-based and surrounding contextual clues. วารสาร เทคโนโลยี อุตสาหกรรม มหาวิทยาลัย ราชภัฏ อุบลราชธานี 6(1): 1–15

  • Mai K, Sukhahuta R, Luekhong P (2014) Thai-English phrase-based statistical machine translation (in Thai). In: Proceedings of the 18th International Computer Science and Engineering Conference (ICSEC2014), Khon Kaen, Thailand, 9pp

  • Meechoonuk M, Rakchonlatee S (2001) An analysis of text translated by machine. MA Thesis, School of Language and Communication, NIDA, Bangkok, Thailand

  • Meknavin S, Charoenpornsawat P, Kijsirikul B (1997) Feature-based Thai word segmentation. In: Proceedings of Natural Language Processing Pacific Rim Symposium, Phuket, Thailand, pp 41–46

  • Mikolov T, Chen K, Corrado G, Dean J (2013) Efficient estimation of word representations in vector space. In: 1st International Conference on Learning Representations, ICLR 2013, Workshop Track Proceedings, Scottsdale, Arizona, 12pp

  • Mittrapiyanuruk P, Sornlertlamvanich V (2000) The automatic Thai sentence extraction. In: Proceedings of the fourth symposium on Natural Language Processing, Chiang Mai, Thailand, pp 23–28

  • Modhiran T, Kosawat K, Klaithin S, Boriboon M, Supnithi T (2005) PARSITTE: online Thai-English machine translation. In: Proceedings of MT Summit X, Phuket, Thailand, pp 13–15

  • Nagao M (1984) A framework of a mechanical translation between Japanese and English by analogy principle. In: Elithorn A, Banerji R (eds) Artificial and human intelligence: edited review papers presented at the international NATO Symposium, October 1981, Lyons, France. North Holland, Amsterdam, pp 173–180

  • Nararatwong R, Kertkeidkachorn N, Cooharojananone N, Okada H (2018) Improving Thai word and sentence segmentation using linguistic knowledge. IEICE Trans Inf Syst 101(12):3218–3225

  • Nathalang S, Porkeaw P, Supnithi T (2010) Don’t use big words with me: an evaluation of English-Thai statistical-based machine translation. In: Proceedings of the International Symposium on Using Corpora in Contrastive and Translation Studies (UCCTS2010), Ormskirk, UK, 19pp

  • Ñeco RP, Forcada ML (1997) Asynchronous translations with recurrent neural nets. In: Proceedings of International Conference on Neural Networks (ICNN’97), Houston, USA, pp 2535–2540

  • Netisopakul P, Keawwan K (2007) Thai sentence segmentation using M-ATN. In: Proceedings of the 7th International Symposium on Natural Language Processing (SNLP 2007), Pattaya, Thailand, pp 91–96

  • Netjinda N, Facundes N, Sirinaovakul B (2009) Toward statistical machine translation for Thai and English. In: Proceedings of the International Symposium On Digital Libraries, Albuquerque, New Mexico, USA, pp 27–28

  • Niu X, Denkowski M, Carpuat M (2018) Bi-Directional neural machine translation with synthetic parallel data. In: Proceedings of the 2nd Workshop on Neural Machine Translation and Generation, Melbourne, Australia, pp 84–91

  • Nomponkrang T, Sanrach C (2016) The comparison of algorithms for Thai-Sentence Classification. Int J Inf Educ Techn 6(10):801–808

  • Noyunsan C, Haruechaiyasak C, Poltree S, Saikeaw KR (2014) A multi-aspect comparison and evaluation on Thai word segmentation programs. In: Proceedings of the Joint International Conference of Semantic Technology (JIST) (Workshops & Posters), Chiang Mai, Thailand, pp 132–135

  • Nusai C, Suzuki Y, Yamazaki H (2007) Method based on EM algorithm for estimating word translation probabilities in Thai–English machine translation. In: Proceedings of the 9th WSEAS international conference on data networks, communications, computers. World scientific and engineering academy and society (WSEAS), pp 407–412

  • Och FJ, Ney H (2003) A systematic comparison of various statistical alignment models. Comput Ling 29(1):19–51

  • Oupatcha T, Thammakoranonta N (2014) English-Thai translating algorithm by subject category using neural network. In: Proceedings of the International Conference on Challenges in IT, Engineering and Technology (ICCIET’2014), Phuket, Thailand, pp 57–59

  • Pa WP, Thu YK, Finch A, Sumita E (2016) A study of statistical machine translation methods for under resourced languages. Procedia Comp Sci 81:250–257

  • Papineni K, Roukos S, Ward T, Zhu W-J (2002) BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, Philadelphia, PA, pp 311–318

  • Phaholphinyo S, Modhiran T, Kritsuthikul N, Supnithi T (2005) A practical of memory-based approach for improving accuracy of MT. In: Proceedings of MT Summit X, Phuket, Thailand, pp 41–46

  • Phodong K, Kongkachandra R (2016) Improvement of word alignment in Thai-English statistical machine translation by grammatical attributes identification. In: Proceedings of the 8th International Conference on Electronics, Computers and Artificial Intelligence (ECAI), Ploiesti, Romania, 4pp

  • Poncelas A, Pidchamook W, Liu C-H, Way A (2020) Multiple segmentations of Thai sentences for neural machine translation. In: SLTU-CCURL 2020: Proceedings of the 1st Joint Workshop on Spoken Language Technologies for Under-resourced languages and Collaboration and Computing for Under-Resourced Languages, Marseille, France, pp 240–244

  • Pooworawan Y (1986) Dictionary-based thai syllable separation. In: Proceedings of the ninth electrical engineering conference (EECON-86), Thailand, pp 409–418

  • Porkaew P, Ruangrajitpakorn T, Trakultaweekoon K, Supnithi T (2009) Translation of noun phrase from English to Thai using phrase-based SMT with CCG reordering rules. In: Proceedings of the Conference of the Pacific Association for Computational Linguistics (PACLING 2009), Sapporo, Japan, 5pp

  • Porkeaw P, Supnithi T, Wutiwiwatchai C (2008) Statistical machine translation for Thai-English Electronic Translator (in Thai). NECTEC Technical Journal, NECTEC-ACE2008 Special Edition, Thailand, 7pp

  • Prasomsuk S, Mol P (2017) Thai to Khmer rule-based machine translation using reordering word to phrase. Glob J Comp Sci Technol 17(3):223–227

  • Quinlan JR (1993) C4.5: programs for machine learning. Morgan Kaufmann, Burlington, Massachusetts

  • Ruangrajitpakorn T, na Chai W, Boonkwan P, Boriboon M, Supnithi T (2007) The design of lexical information for Thai to English MT. In: Proceedings of 7th International Symposium on Natural Language Processing (SNLP 2007), Pattaya, Thailand, 7pp

  • Saetia C, Chuangsuwanich E, Chalothorn T, Vateekul P (2019) Semi-supervised Thai Sentence segmentation using local and distant word representations. arXiv:1908.01294

  • Seljan S, Brkic M, Vicic T (2012) BLEU Evaluation of machine-translated English-Croatian legislation. In: Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC 12), Istanbul, Turkey, pp 2143–2148

  • Sennrich R, Haddow B, Birch A (2016) Neural machine translation of rare words with subword units. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Berlin, Germany, pp 1715–1725

  • Shen S, Cheng Y, He Z, He W, Wu H, Sun M, Liu Y (2015) Minimum risk training for neural machine translation. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Beijing, China, pp 1683–1692

  • Slayden G, Hwang MY, Schwartz L (2010a) Thai sentence-breaking for large-scale SMT. In: Proceedings of the 1st Workshop on South and Southeast Asian Natural Language Processing, Beijing, China, pp 8–16

  • Slayden G, Hwang MY, Schwartz L (2010b) Large-Scale Thai statistical machine translation (Vol. 41). MSR-TR-2010. Redmond: Microsoft Corporation. https://www.microsoft.com/en-us/research/publication/large-scale-thai-statistical-machine-translation

  • Slayden G, Luqman E (2010) Derivative sentence breaking for moore alignment. Available from https://www.thai-language.com/ref/breaking-words

  • Sornlertlamvanich V (1993) Word segmentation for Thai in machine translation system. Machine Translation, National Electronics and Computer Technology Center, Bangkok, Thailand, pp 50–56

  • Sornlertlamvanich V, Charoenporn T, Isahara H (1997) ORCHID: Thai part-of-speech tagged corpus. Technical Report Orchid TR-NECTEC-1997–001, National Electronics and Computer Technology Center, Thailand, pp 5–19

  • Sornlertlamvanich V, Potipiti T, Charoenporn T (2000a) Automatic corpus-based Thai word extraction with the C4.5 learning algorithm. In: Proceedings of the 18th International Conference on Computational Linguistics (COLING2000), Saarbrücken, Germany, pp 802–807

  • Sornlertlamvanich V, Potipiti T, Wutiwiwatchai C, Mittrapiyanuruk P (2000b) The state of the art in Thai language processing. In: Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics (ACL2000), Hong Kong, pp 597–598

  • Steedman M (1987) Combinatory grammars and parasitic gaps. Nat Lang Ling Theory 5:403–439

  • Stolcke A (2002) SRILM-An extensible language modeling toolkit. In: Proceedings of the Seventh International Conference on Spoken Language Processing, Denver, USA, pp 901–904

  • Suesatpanit K, Punyabukkana P, Suchato A (2009) Thai word segmentation using character-level information. In: Proceedings of the InterBEST 2009 Thai word segmentation workshop, Bangkok, Thailand, pp 18–23

  • Supnithi P, Boonkwan T (2008) Memory-inductive categorial grammar: an approach to gap resolution in analytic-language translation. In: Proceedings of the Third International Joint Conference on Natural Language Processing, Hyderabad, India, pp 80–87

  • Supnithi T, Ruangrajitpakorn T, Trakultaweekool K, Porkaew P (2010) AutoTagTCG: a framework for automatic Thai CG Tagging. In: Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Valletta, Malta, pp 971–974

  • Supnithi T, Sornlertlamvanich V, Charoenporn T (2002) A cross system machine translation. In: Proceedings of the 2002 COLING workshop on Machine translation in Asia, Taipei, Taiwan, 7pp

  • Sutantayawalee V, Porkaew P, Boonkwan P, Phaholphinyo S, Supnithi T (2014) Improvement of statistical machine translation using character-based segmentation with monolingual and bilingual information. In: Proceedings of the 28th Pacific Asia Conference on Language, Information and Computing, Phuket, Thailand, pp 145–151

  • Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks. In: Proceedings of NeurIPS 2014, Twenty-eighth Conference on Neural Information Processing Systems. Montreal, Canada, pp 3104–3112

  • Tanruangporn P (2017) Experimenting with Neural Machine Translation for Thai. https://medium.com/@petepeeradejtanruangporn/experimenting-with-neural-machine-translation-for-thai-1681fd2b375a Accessed 11 June 2018

  • Tapsai C, Meesad P, Unger H (2019) An overview on the development of Thai natural language processing. Inf Technol J 15(2):45–52

  • TeCho J, Nattee C, Theeramunkong T (2009) A corpus-based approach for automatic Thai unknown word recognition using boosting techniques. IEICE Trans Inf Syst 92(12):2321–2333

  • Theeramunkong T, Usanavasin S (2001) Non-dictionary-based Thai word segmentation using decision trees. In: Proceedings of the First International Conference on Human Language Technology Research, San Diego, California, 5pp

  • Thonglor K (1972) Principles of Thai language. Ruamsam, Bangkok

  • Tongchim S, Altmeyer R, Sornlertlamvanich V, Isahara H (2008) A dependency parser for Thai. In: Proceedings of the international conference on language resources and evaluation, LREC 2008, Marrakech, Morocco

  • Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. In: Proceedings of NeurIPS 2017: Thirty-first Conference on Neural Information Processing Systems. Long Beach, CA, pp 5998–6008

  • Way A (2018) Quality expectations of machine translation. In: Moorkens J, Castilho S, Gaspari F, Doherty S (eds) Translation quality assessment. Springer, Cham, Switzerland

  • Way A, Hearne M (2011) On the role of translations in state-of-the-art statistical machine translation. Lang Ling Compass 5:227–248

  • Wutiwiwatchai C (2015) Language and speech translation activities in Thailand. ASEAN-NICT Round Table – Feb 2015 [PowerPoint presentation], Available at https://www.nict.go.jp/en/asean_ivo/lde9n2000000anwh-att/208-20150226_ASEAN-NICT_Chai.pdf

  • Wutiwiwatchai C, Furui S (2007) Thai speech processing technology: a review. Speech Commun 49(1):8–27

  • Wutiwiwatchai C, Hansakunbuntheung C, Rugchatjaroen A, Saychum S, Kasuriya S, Chootrakool P (2017) Thai text-to-speech synthesis: a review. J Intell Inf Smart Technol 2:7

  • Wutiwiwatchai C, Supnithi T, Boonkwan P (2013) The network-based ASEAN language translation public service. ASEAN-MT NAC 2013 [PowerPoint presentation]. Available at https://www.nstda.or.th/nac/2013/download/presentation/NAC2013_Set2/CC-308-01-AM/Chai.pdf

  • Wutiwiwatchai C, Supnithi T, Kosawat K (2008) Speech-to-speech translation activities in Thailand. In: Proceedings of the Workshop on Technologies and Corpora for Asia-Pacific Speech Translation (TCAST), Hyderabad, India, pp 1–7

  • Wutiwiwatchai C, Supnithi T, Porkaew P, Thatphithakkul N (2009) Improvement issues in English-Thai speech translation. In: Proceedings of the Workshop on Technologies and Corpora for Asia-Pacific Speech Translation (TCAST), Suntec, Singapore, pp 10–14

  • Xiong H, He Z, Hu X, Wu H (2018) Multi-channel encoder for neural machine translation. In: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, USA, pp 4962–4969

  • Wu Y, Schuster M, Chen Z, Le QV, Norouzi M, Macherey W, Krikun M, Cao Y, Gao Q, Macherey K, Klingner J (2016) Google's neural machine translation system: bridging the gap between human and machine translation. arXiv:1609.08144

  • Zhang J, Wang M, Liu Q, Zhou J (2017) Incorporating word reordering knowledge into attention-based neural machine translation. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Vancouver, Canada, pp 1524–1534

  • Zhou J, Cao Y, Wang X, Li P, Xu W (2016) Deep recurrent models with fast-forward connections for neural machine translation. Trans Assoc Comput Ling 4:371–383

  • Zhou N, Aw A, Lertcheva N, Wang X (2016) A word labeling approach to Thai sentence boundary detection and POS Tagging. In: Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, Osaka, Japan, pp 319–327

Author information

Corresponding author

Correspondence to Séamus Lyons.

About this article

Cite this article

Lyons, S. A review of Thai–English machine translation. Machine Translation 34, 197–230 (2020). https://doi.org/10.1007/s10590-020-09248-8

  • DOI: https://doi.org/10.1007/s10590-020-09248-8