Abstract
Improving machine translation (MT) for languages such as Thai requires access to the knowledge reported in past and current research. Thai exhibits distinctive features shared by several Asian languages, and the focus of MT has recently shifted to neural network-based approaches, so researchers need an understanding of these languages to support further work. The purpose of this study is to provide an overview of the significant research in Thai–English MT that is both a valuable reference for current researchers in the field and accessible to the non-expert. We include details of the relevant language characteristics and extensive coverage of the important contributions to MT in Thailand. Although the application of neural networks to translation, known as neural machine translation (NMT), is evolving rapidly and is not yet widely reported in academic work involving Thai translation, it has shown potential for languages that require segmentation or have few resources, and is therefore of special interest. Translation techniques are often not directly applicable to Asian languages because of their linguistic features and versatile writing systems, yet NMT is already in widespread use in industry for the translation of Asian languages, including Thai. Given this success and the potential for Thai translation, we aim to integrate this important area into the current literature on Thai–English MT, to encourage interest, and to support further advancement.
Notes
Overview. History of the Thai language. https://www.thai-language.com/ref/Overview Accessed 14 June 2018.
Inside a Thai Syllable: Part I. https://www.thai-language.com/id/830221 Accessed 7 July 2020.
As Way and Hearne (2011, p.234) point out, “In seeking to understand SMT in particular, this is a key distinction; while the means by which RBMT and EBMT generate translations usually look somewhat plausible to linguists and translators, the methods of translation generation in SMT are not intuitively plausible. In fact, the methods used—at least until the recent attempts to incorporate syntactic knowledge into PB-SMT—are not intended to be either linguistically or cognitively plausible (just probabilistically plausible), and holding onto the notion that they somehow are or should be simply hinders understanding of SMT” (original emphasis).
https://www.nist.gov/itl/iad/mig/openmt15-evaluation Accessed 14 June 2018.
https://www.youtube.com/watch?v=nR74lBO5M3s&t Accessed 7 June 2018.
https://sites.google.com/site/iwsltevaluation2015/mt-track Accessed 11 June 2018.
Network-Based ASEAN Languages. https://www.aseanmt.org/index.php?q=index/status_update Accessed 25 June 2018.
https://cloud.google.com/translate/docs/languages#languages-nmt Accessed 7 June 2018.
Microsoft Translator now offers more accurate and human-like translations of Thai. https://www.nationthailand.com/Startup_and_IT/30343651 Published on April 21, 2018, written by ThaiVisa. Accessed 6 June 2018.
Baidu Thailand reveals marketing platform. https://www.bangkokpost.com/tech/local-news/1208193/baidu-thailand-reveals-marketing-platfor. Published on 3/3/2017, written by Srisamorn Phoosuphanusorn. Accessed 7 June 2018.
South Korea's Naver launches official version of AI translator ‘Papago’. https://www.nationthailand.com/Startup_and_IT/30321483. Published on July 22, 2017, written by The Korea Herald/ANN. Accessed 9 December 2018.
Explorations on Multi-lingual Neural Machine Translation. https://www.youtube.com/watch?v=dOewfz19dMA Accessed 7 June 2018.
https://pioneer.chula.ac.th/~awirote/resources/thai-word-segmentation.html Accessed 5 January 2019.
cf. https://moin.delph-in.net/GlennSlayden Accessed 15 July 2020.
Deepcut. A Thai Word Tokenization Library using Deep Neural Network. https://github.com/rkcosmos/deepcut Accessed 11 June 2018.
Thai Word Segmentation with Bi-Directional RNN. https://github.com/sertiscorp/thai-word-segmentation Accessed 13 June 2018.
Thai Word Segmentation with Bi-Directional RNN. https://www.sertiscorp.com/november-20-2017 Accessed 13 June 2018.
Thai Natural Language Processing (Thai NLP) Resource. https://github.com/kobkrit/nlp_thai_resources Accessed 7 July 2020.
CutKum. Thai Word-Segmentation with Deep Learning in Tensorflow. https://github.com/pucktada/cutkum Accessed 11 June 2018.
https://github.com/KenjiroAI/SynThai Accessed 11 June 2018.
References
Aroonmanakun W (2002) Collocation and Thai word segmentation. In: Proceedings of SNLP-oriental COCOSDA, Hua Hin, Thailand, pp 68–75
Aroonmanakun W (2007) Thoughts on word and sentence segmentation in Thai. In: Proceedings of the seventh symposium on natural language processing, Pattaya, Thailand, pp 85–90
Bahdanau D, Cho K, Bengio Y (2014) Neural machine translation by jointly learning to align and translate. arXiv:1409.0473
Bheganan P, Nayak R, Xu Y (2009) Thai word segmentation with hidden Markov model and decision tree. In: Proceedings of Pacific-Asia Conference on Knowledge Discovery and Data Mining, Bangkok, Thailand. Springer-Verlag, Berlin, Heidelberg, pp 74–85
Boonkwan P, Kawtrakul A (2002) Plaesarn: machine-aided translation tool for English-to-Thai. In: Proceedings of the 2002 COLING workshop on machine translation in Asia, Taipei, Taiwan, 7pp
Boonkwan P, Supnithi T (2017) Bidirectional deep learning of context representation for joint word segmentation and POS tagging. In: Proceedings of international conference on computer science, applied mathematics and applications. Springer, Cham, Switzerland, pp 184–196
Brown P, Della Pietra SA, Della Pietra VJ, Mercer RL (1993) The mathematics of statistical machine translation: parameter estimation. Comput Ling 19(2):263–311
Carl M, Way A (eds) (2003) Recent advances in example-based machine translation. Kluwer Academic Publishers, Dordrecht, The Netherlands
Castano A, Casacuberta F (1997) A connectionist approach to machine translation. In: Proceedings of EUROSPEECH-1997: Fifth European conference on speech communication and technology, Rhodes, Greece, pp 91–94
Cettolo M, Girardi C, Federico M (2012) WIT3: web inventory of transcribed and translated talks. In: Proceedings of the 16th conference of European Association for Machine Translation (EAMT), Trento, Italy, pp 261–268
Chancharoen K, Tannin N, Sirinaovakul B (1999) Pattern-based machine translation for English-Thai. In: Proceedings of the 13th Pacific Asia conference on language, information and computation. Taiwan, China, pp 329–336
Charoenpornsawat P, Schultz T (2008) Improving word segmentation for Thai speech translation. In: Proceedings of IEEE spoken language technology workshop, Goa, India, pp 241–244
Charoenpornsawat P, Sornlertlamvanich V (2001) Automatic sentence break disambiguation for Thai. In: Proceedings of international conference on computer processing of oriental languages (ICCPOL), Seoul, South Korea, pp 231–235
Charoenpornsawat P, Sornlertlamvanich V, Charoenporn T (2002) Improving translation quality of rule-based machine translation. In: Proceedings of the 2002 COLING workshop on machine translation in Asia, Taipei, Taiwan, 6pp
Chen MX, Firat O, Bapna A, Johnson M, Macherey W, Foster G, Jones L, Parmar N, Schuster M, Chen Z, Wu Y, Hughes M (2018) The best of both worlds: combining recent advances in neural machine translation. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Melbourne, Australia, pp 76–86
Cheng Y, Shen S, He Z, He W, Wu H, Sun M, Liu Y (2015) Agreement-based joint training for bidirectional attention-based neural machine translation. arXiv:1512.04650
Chiang D (2005) A hierarchical phrase-based model for statistical machine translation. In: Proceedings of ACL-2005: 43rd Annual Meeting of the Association for Computational Linguistics, Ann Arbor, Michigan, pp 263–270
Chimsuk T, Auwatanamongkol S (2009) A Thai to English machine translation system using Thai LFG tree structure as Interlingua. World Acad Sci Eng Technol Int J Math Comput Phys Electr Comput Eng 3:1134–1139
Cho K, Van Merriënboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, Bengio Y (2014) Learning phrase representations using RNN encoder-decoder for statistical machine translation. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), Doha, Qatar, pp 1724–1734
Chung J, Cho K, Bengio Y (2016) A character-level decoder without explicit segmentation for neural machine translation. In: Proceedings of the 54th annual meeting of the association for computational linguistics (volume 1: long papers). Association for Computational Linguistics, Berlin, Germany, pp 1692–1703
Coughlin R, Setthawong R, Setthawong P (2018) An improved English-Thai translation framework for non-timing aligned parallel corpora using bleualign with explicit feedback. In: Proceedings of the 10th international conference on advances in information technology, Bangkok, Thailand, 8pp
Denkowski M, Lavie A (2010) Choosing the right evaluation for machine translation: an examination of annotator and automatic metric performance on human judgment tasks. In: Proceedings of AMTA 2010: the Ninth conference of the Association for Machine Translation in the Americas, Denver, Colorado, 9pp
Dong D, Wu H, He W, Yu D, Wang H (2015) Multi-task learning for multiple language translation. In: Proceedings of the 53rd Annual meeting of the association for computational linguistics and the 7th international joint conference on natural language processing (Volume 1: Long Papers), Beijing, China, pp 1723–1732
Finch A, Liu L, Wang X, Sumita E (2016) Target-bidirectional neural models for machine transliteration. In: Proceedings of the sixth named entity workshop, Berlin, Germany, pp 78–82
Firat O, Cho K, Bengio Y (2016) Multi-way, multilingual neural machine translation with a shared attention mechanism. In: Proceedings of the 2016 conference of the North American chapter of the association for computational linguistics: human language technologies. San Diego, California, pp 866–875
Forcada ML, Ñeco RP (1997) Recursive hetero-associative memories for translation. In: Biological and artificial computation: from neuroscience to technology (international work-conference on artificial neural networks (IWANN'97), proceedings), Lanzarote, Canary Islands, Spain. Springer, Berlin, Heidelberg, pp 453–462
Goodman K, Nirenburg S (eds) (1991) The KBMT project: a case study in knowledge-based machine translation. Morgan Kaufmann, Burlington, Massachusetts
Haruechaiyasak C, Kongyoung S (2009) TLex: Thai lexeme analyser based on the conditional random fields. In: InterBEST 2009 Thai word segmentation workshop, proceedings of 8th international symposium on natural language processing, Bangkok, Thailand, pp 13–17
Haruechaiyasak C, Kongyoung S, Dailey M (2008) A comparative study on Thai word segmentation approaches. In: Proceedings of 5th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology, Krabi, Thailand, pp 125–128
Haruechaiyasak C, Sangkeettrakarn C, Palingoon P, Kongyoung S, Damrongrat C (2006) A collaborative framework for collecting Thai unknown words from the web. In: Proceedings of the COLING/ACL Main Conference Poster Sessions, Sydney, Australia, pp 345–352
Hutchins WJ (1995) Machine Translation: A Brief History. In: Koerner EFK, Asher RE (eds) Concise history of the language sciences: from the Sumerians to the cognitivists. Pergamon Press, Oxford, UK, pp 431–445
Isozaki H, Hirao T, Duh K, Sudoh K, Tsukada H (2010) Automatic evaluation of translation quality for distant language pairs. In: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, Cambridge, MA, pp 944–952
Jean S, Cho K, Memisevic R, Bengio Y (2015) On using very large target vocabulary for neural machine translation. In: Proceedings of the 53rd annual meeting of the association for computational linguistics and the 7th international joint conference on natural language processing (Volume 1: Long Papers), Beijing, China, pp 1–10
Johnson M, Schuster M, Le QV, Krikun M, Wu Y, Chen Z, Thorat N, Viégas F, Wattenberg M, Corrado G, Hughes M (2017) Google's multilingual neural machine translation system: enabling zero-shot translation. Trans Assoc Comput Ling 5:339–351
Kalchbrenner N, Blunsom P (2013) Recurrent continuous translation models. In: Proceedings of the 2013 conference on empirical methods in natural language processing, USA, pp 1700–1709
Kampanya N, Boonkwan P, Kawtrakul A (2002) Bilingual unknown word alignment tool for English-Thai. In: Proceedings of the joint international conference of SNLP-Oriental COCOSDA 2002: the fifth symposium on natural language processing & the fifth oriental COCOSDA Workshop, Hua Hin, Prachuapkirikhan, Thailand, 7pp
Kaplan RM, Netter K, Wedekind J, Zaenen A (1989) Translation by structural correspondences. In: Proceedings of the fourth conference of the European chapter of the association for computational linguistics, Manchester, England, pp 272–281
Kawtrakul A, Boonkwan P (2004) An integrated tool for translation-memory maintenance. https://pdfs.semanticscholar.org/7420/f88b7b94cffc504c9fd3faad5f43836e8fa8.pdf?_ga=2.142574834.1106393232.1594816470-1712962596.1567069468
Kawtrakul A, Kumtanode S, Jamjunya T, Jewriyavech A (1995) A Lexibase model for writing production assistant system. In: Proceedings of the second symposium on natural language processing, Bangkok, Thailand, pp 226–236
Kawtrakul A, Praneetpolgrang P (2014) A history of AI research and development in Thailand: three periods. Three Dir AI Mag 35(2):83–92
Kawtrakul A, Suktarachan M, Varasai P, Chanlekha H (2002) A state of the art of Thai language resources and Thai language behavior analysis and modeling. In: Proceedings of COLING-02: The 3rd workshop on Asian language resources and international standardization, Taipei, Taiwan, 8pp
Kazimi MB (2017) Coverage model for character-based neural machine translation. Master's thesis, Universitat Politècnica de Catalunya, Barcelona, Spain
Khankasikam K, Muansuwan N (2005) Thai word segmentation a lexical semantic approach. In: Proceedings of the Tenth Machine Translation Summit, Phuket, Thailand, pp 331–338
Kim Y, Rush AM (2016) Sequence-level knowledge distillation. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, Texas, pp 1317–1327
Kit C, Wong TM (2008) Comparative evaluation of online machine translation systems with legal texts. Law Libr J 100:299–321
Klaithin S, Kriengket K, Phaholphinyo S, Kosawat K (2011) Thai word segmentation verification tool. In: Proceedings of the 2nd Workshop on South Southeast Asian Natural Language Processing (WSSANLP), Chiang Mai, Thailand, pp 16–22
Klein G, Kim Y, Deng Y, Senellart J, Rush AM (2017) OpenNMT: open-source toolkit for neural machine translation. arXiv:1701.02810
Koehn P (2004) Pharaoh: a beam search decoder for phrase-based statistical machine translation models. In: Machine translation: from real users to research: 6th conference of the Association for Machine Translation in the Americas, AMTA 2004, Proceedings, Washington, DC, Springer Verlag, Berlin, pp 115–124
Koehn P (2009) Statistical machine translation. Cambridge University Press, Cambridge, UK
Koehn P, Hoang H, Birch A, Callison-Burch C, Federico M, Bertoldi N, Cowan B, Shen W, Moran C, Zens R, Dyer C, Bojar O, Constantin A, Herbst E (2007) Moses: open source toolkit for statistical machine translation. In: Proceedings of the 45th annual meeting of the ACL on interactive poster and demonstration sessions, Prague, Czech Republic, pp 177–180
Kongyoung S, Rugchatjaroen A, Kosawat K (2015) TLex+: a hybrid method using conditional random fields and dictionaries for Thai word segmentation. In: International conference on knowledge, information, and creativity support systems, Phuket. Springer, Cham, pp 112–125
Kosawat K, Boriboon M, Chootrakool P, Chotimongkol A, Klaithin S, Kongyoung S, Kriengket K, Phaholphinyo S, Purodakananda S, Thanakulwarapas T, Wutiwiwatchai C (2009) BEST 2009: Thai word Segmentation software contest. In: Proceedings of the Eighth International Symposium on Natural Language Processing, Bangkok, Thailand, pp 83–88
Kritsuthikul N, Thammano A, Supnithi T (2006) English–Thai example-based machine translation using n-gram model. In: Proceedings of the 2006 IEEE International Conference on Systems, Man and Cybernetics, Taipei, Taiwan, pp 4386–4390
Kruengkrai C, Sornlertlamvanich V, Isahara H (2006) A conditional random field framework for Thai morphological analysis. In: Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06), Genoa, Italy, pp 2419–2424
Kruengkrai C, Uchimoto K, Kazama J, Torisawa K, Isahara H, Jaruskulchai C (2009) A word and Character-cluster hybrid model for Thai Word segmentation. In: Proceedings of the Eighth International Symposium on Natural Language Processing, Bangkok, Thailand, pp 24–29
Labutsri N, Chamchong R, Booth R, Rodtook A (2009) English syntactic reordering for English-Thai phrase-based statistical machine translation. In: Proceedings of the 6th International Joint Conference on Computer Science and Software Engineering (JCSSE 2009), Phuket, Thailand, pp 360–366
Lee HG, Lee J, Kim JS, Lee CK (2015) NAVER machine translation system for WAT 2015. In: Proceedings of the 2nd workshop on asian translation (WAT2015), Kyoto, Japan, pp 69–73
Lee J, Cho K, Hofmann T (2017) Fully character-level neural machine translation without explicit segmentation. Trans Assoc Comput Ling 5:365–378
Lertpiya A, Chaiwachirasak T, Maharattanamalai N, Lapjaturapit T, Chalothorn T, Tirasaroj N, Chuangsuwanich E (2018) A preliminary study on fundamental Thai NLP tasks for user-generated web content. In: Proceedings of the 2018 International Joint Symposium on Artificial Intelligence and Natural Language Processing (iSAI-NLP), Pattaya, Thailand, 8pp
Limcharoen P, Nattee C, Theeramunkong T (2009) Thai word Segmentation based-on GLR parsing technique and word n-gram model. In: Proceedings of the Eighth International Symposium on Natural Language Processing, Bangkok, Thailand, 7pp
Lommel A (2018) Metrics for translation quality assessment: a case for standardising error typologies. In: Moorkens J, Castilho S, Gaspari F, Doherty S (eds) Translation quality assessment. Springer, Cham, Switzerland
Luekhong P, Limkonchotiwat P, Ruangrajitpakorn T (2019) A study on an effect of using deep learning in Thai-English machine translation processes. In: Proceedings of the 2019 14th International Joint Symposium on Artificial Intelligence and Natural Language Processing (iSAI-NLP), Chiang Mai, Thailand, 6pp
Luekhong P, Ruangrajitpakorn T, Sukhahuta R, Supnithi T (2016) A study of a Thai-English translation comparing on applying phrase-based and hierarchical phrase-based translation. In: Proceedings of the International Symposium on Natural Language Processing, Advances in Natural Language Processing, Intelligent Informatics and Smart Technology, Springer International Publishing, Cham, Switzerland, pp 38–48
Luekhong P, Ruangrajitpakorn T, Sukhahuta R, Supnithi T (2017) A framework of 2-step bilingual alignment for SMT: in Case Study of Thai-English Translation. Chiang Mai University Library Journal Article, Chiang Mai, Thailand: https://cmuir.cmu.ac.th/jspui/handle/6653943832/57064
Luekhong P, Ruangrajitpakorn T, Supnithi T, Sukhahuta R (2013) Pooja: similarity-based bilingual word alignment framework for SMT. In: Proceedings of the 10th International Symposium on Natural Language Processing, Phuket, Thailand, pp 199–204
Luekhong P, Sukhahuta R, Porkaew P, Ruangrajitpakorn T, Supnithi T (2012) A comparative study on applying hierarchical phrase-based and phrase-based on Thai-Chinese Translation. In: Proceedings of the 2012 Seventh International Conference on Knowledge, Information and Creativity Support Systems, Melbourne, Australia, pp 126–133
Luong MT, Pham H, Manning CD (2015a) Effective approaches to attention-based neural machine translation. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, Lisbon, Portugal, pp 1412–1421
Luong MT, Sutskever I, Le QV, Vinyals O, Zaremba W (2015b) Addressing the rare word problem in neural machine translation. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Beijing, China, pp 11–19
Lyons S (2016a) A survey of the use of mobile technology and translation tools by students at secondary school in Thailand. Payap University Journal 26(1):35–57
Lyons S (2016b) Quality of Thai to English machine translation. In: Pacific Rim Knowledge Acquisition Workshop, Phuket, Thailand, pp 261–270. Springer International Publishing, Cham, Switzerland
Lyons S (2020) Comparison of neural network and traditional Thai word segmentation systems. In: Proceedings of the Payap University Research Symposium 2020, Chiang Mai, Thailand, pp 593–604
Mahatthanachai C, Malaivongs K, Tantranont N (2016) Thai word segmentation technique for solving unknown words and ambiguous words using rules-based and surrounding contextual clues. วารสาร เทคโนโลยี อุตสาหกรรม มหาวิทยาลัย ราชภัฏ อุบลราชธานี 6(1): 1–15
Mai K, Sukhahuta R, Luekhong P (2014) Thai-English phrase-based statistical machine translation (in Thai). In: Proceedings of the 18th International Computer Science and Engineering Conference (ICSEC2014), Khon Kaen, Thailand, 9pp
Meechoonuk M, Rakchonlatee S (2001) An analysis of text translated by machine. MA Thesis, School of Language and Communication, NIDA, Bangkok, Thailand
Meknavin S, Charoenpornsawat P, Kijsirikul B (1997) Feature-based Thai word segmentation. In: Proceedings of Natural Language Processing Pacific Rim Symposium, Phuket, Thailand, pp 41–46
Mikolov T, Chen K, Corrado G, Dean J (2013) Efficient estimation of word representations in vector space. In: 1st International Conference on Learning Representations, ICLR 2013, Workshop Track Proceedings, Scottsdale, Arizona, 12pp
Mittrapiyanuruk P, Sornlertlamvanich V (2000) The automatic Thai sentence extraction. In: Proceedings of the fourth symposium on Natural Language Processing, Chiang Mai, Thailand, pp 23–28
Modhiran T, Kosawat K, Klaithin S, Boriboon M, Supnithi T (2005) PARSITTE: online Thai-English machine translation. In: Proceedings of MT Summit X, Phuket, Thailand, pp 13–15
Nagao M (1984) A framework of a mechanical translation between Japanese and English by analogy principle. In: Elithorn A, Banerji R (eds) Artificial and human intelligence: edited review papers presented at the international NATO Symposium, October 1981, Lyons, France. North Holland, Amsterdam, pp 173–180
Nararatwong R, Kertkeidkachorn N, Cooharojananone N, Okada H (2018) Improving Thai word and sentence segmentation using linguistic knowledge. IEICE Trans Inf Syst 101(12):3218–3225
Nathalang S, Porkeaw P, Supnithi T (2010) Don’t use big words with me: an evaluation of English-Thai statistical-based machine translation. In: Proceedings of the International Symposium on Using Corpora in Contrastive and Translation Studies (UCCTS2010), Ormskirk, UK, 19pp
Ñeco RP, Forcada ML (1997) Asynchronous translations with recurrent neural nets. In: Proceedings of International Conference on Neural Networks (ICNN’97), Houston, USA, pp 2535–2540
Netisopakul P, Keawwan K (2007) Thai sentence segmentation using M-ATN. In: Proceedings of the 7th International Symposium on Natural Language Processing (SNLP 2007), Pattaya, Thailand, 2007, pp 91–96
Netjinda N, Facundes N, Sirinaovakul B (2009) Toward statistical machine translation for Thai and English. In: Proceedings of the International Symposium On Digital Libraries, Albuquerque, New Mexico, USA, pp 27–28
Niu X, Denkowski M, Carpuat M (2018) Bi-Directional neural machine translation with synthetic parallel data. In: Proceedings of the 2nd Workshop on Neural Machine Translation and Generation, Melbourne, Australia, pp 84–91
Nomponkrang T, Sanrach C (2016) The comparison of algorithms for Thai-Sentence Classification. Int J Inf Educ Techn 6(10):801–808
Noyunsan C, Haruechaiyasak C, Poltree S, Saikeaw KR (2014) A multi-aspect comparison and evaluation on Thai word segmentation programs. In: Proceedings of the Joint International Conference of Semantic Technology (JIST) (Workshops & Posters), Chiang Mai, Thailand, pp 132–135
Nusai C, Suzuki Y, Yamazaki H (2007) Method based on EM algorithm for estimating word translation probabilities in Thai–English machine translation. In: Proceedings of the 9th WSEAS international conference on data networks, communications, computers. World scientific and engineering academy and society (WSEAS), pp 407–412
Och FJ, Ney H (2003) A systematic comparison of various statistical alignment models. Comput Ling 29(1):19–51
Oupatcha T, Thammakoranonta N (2014) English-Thai translating algorithm by subject category using neural network. In: Proceedings of the International Conference on Challenges in IT, Engineering and Technology (ICCIET’2014), Phuket, Thailand, pp 57–59
Pa WP, Thu YK, Finch A, Sumita E (2016) A study of statistical machine translation methods for under resourced languages. Procedia Comp Sci 81:250–257
Papineni K, Roukos S, Ward T, Zhu W-J (2002) BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, Philadelphia, PA, pp 311–318
Phaholphinyo S, Modhiran T, Kritsuthikul N, Supnithi T (2005) A practical of memory-based approach for improving accuracy of MT. In: Proceedings of MT Summit X, Phuket, Thailand, pp 41–46
Phodong K, Kongkachandra R (2016) Improvement of word alignment in Thai-English statistical machine translation by grammatical attributes identification. In: Proceedings of the 8th International Conference on Electronics, Computers and Artificial Intelligence (ECAI), Ploiesti, Romania, 4pp
Poncelas A, Pidchamook W, Liu C-H, Way A (2020) Multiple segmentations of Thai sentences for neural machine translation. In: SLTU-CCURL 2020: Proceedings of the 1st Joint Workshop on Spoken Language Technologies for Under-resourced languages and Collaboration and Computing for Under-Resourced Languages, Marseille, France, pp 240–244
Pooworawan Y (1986) Dictionary-based Thai syllable separation. In: Proceedings of the ninth electrical engineering conference (EECON-86), Thailand, pp 409–418
Porkaew P, Ruangrajitpakorn T, Trakultaweekoon K, Supnithi T (2001) Translation of noun phrase from English to Thai using phrase-based SMT with CCG reordering rules. In: Proceedings of the Conference of the Pacific Association for Computational Linguistics (PACLING 2009), Sapporo, Japan, 5pp
Porkeaw P, Supnithi T, Wutiwiwatchai C (2008) Statistical machine translation for Thai-English Electronic Translator (in Thai). NECTEC Technical Journal, NECTEC-ACE2008 Special Edition, Thailand, 7pp
Prasomsuk S, Mol P (2017) Thai to Khmer rule-based machine translation using reordering word to phrase. Glob J Comp Sci Technol 17(3):223–227
Quinlan JR (1993) C4.5: programs for machine learning. Morgan Kaufmann, Burlington, Massachusetts
Ruangrajitpakorn T, na Chai W, Boonkwan P, Boriboon M, Supnithi T (2007) The design of lexical information for Thai to English MT. In: Proceedings of 7th International Symposium on Natural Language Processing (SNLP 2007), Pattaya, Thailand, 7pp
Saetia C, Chuangsuwanich E, Chalothorn T, Vateekul P (2019) Semi-supervised Thai Sentence segmentation using local and distant word representations. arXiv:1908.01294
Seljan S, Brkic M, Vicic T (2012) BLEU Evaluation of machine-translated English-Croatian legislation. In: Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC 12), Istanbul, Turkey, pp 2143–2148
Sennrich R, Haddow B, Birch A (2016) Neural machine translation of rare words with subword units. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Berlin, Germany, pp 1715–1725
Shen S, Cheng Y, He Z, He W, Wu H, Sun M, Liu Y (2015) Minimum risk training for neural machine translation. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Beijing, China, pp 1683–1692
Slayden G, Hwang MY, Schwartz L (2010a) Thai sentence-breaking for large-scale SMT. In: Proceedings of the 1st Workshop on South and Southeast Asian Natural Language Processing, Beijing, China, pp 8–16
Slayden G, Hwang MY, Schwartz L (2010b) Large-Scale Thai statistical machine translation (Vol. 41). MSR-TR-2010. Redmond: Microsoft Corporation. https://www.microsoft.com/en-us/research/publication/large-scale-thai-statistical-machine-translation
Slayden G, Luqman E (2010) Derivative sentence breaking for moore alignment. Available from https://www.thai-language.com/ref/breaking-words
Sornlertlamvanich V (1993) Word segmentation for Thai in machine translation system. Machine translation. National Electronics and Computer Technology Center, Bangkok, Thailand, pp 50–56
Sornlertlamvanich V, Charoenporn T, Isahara H (1997) ORCHID: Thai part-of-speech tagged corpus. Technical Report Orchid TR-NECTEC-1997–001, National Electronics and Computer Technology Center, Thailand, pp 5–19
Sornlertlamvanich V, Potipiti T, Charoenporn T (2000a) Automatic corpus-based Thai Word extraction with the C4.5 learning algorithm. In: Proceedings of the 18th International Conference on Computational Linguistics (COLING2000), Saarbrucken, Germany, pp 802–807
Sornlertlamvanich V, Potipiti T, Wutiwiwatchai C, Mittrapiyanuruk P (2000b) The state of the art in Thai language processing. In: Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics (ACL2000), Hong Kong, pp 597–598
Steedman M (1987) Combinatory grammars and parasitic gaps. Nat Lang Ling Theory 5:403–439
Stolcke A (2002) SRILM-An extensible language modeling toolkit. In: Proceedings of the Seventh International Conference on Spoken Language Processing, Denver, USA, pp 901–904
Suesatpanit K, Punyabukkana P, Suchato A (2009) Thai word segmentation using character-level information. In: Proceedings of the InterBEST 2009 Thai word segmentation workshop, Bangkok, Thailand, pp 18–23
Supnithi P, Boonkwan T (2008) Memory-inductive categorial grammar: an approach to gap resolution in analytic-language translation. In: Proceedings of the Third International Joint Conference on Natural Language Processing, Hyderabad, India, pp 80–87
Supnithi T, Ruangrajitpakorn T, Trakultaweekool K, Porkaew P (2010) AutoTagTCG: a framework for automatic Thai CG Tagging. In: Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Valletta, Malta, pp 971–974
Supnithi T, Sornlertlamvanich V, Charoenporn T (2002) A cross system machine translation. In: Proceedings of the 2002 COLING workshop on Machine translation in Asia, Taipei, Taiwan, 7pp
Sutantayawalee V, Porkaew P, Boonkwan P, Phaholphinyo S, Supnithi T (2014) Improvement of statistical machine translation using character-based segmentation with monolingual and bilingual information. In: Proceedings of the 28th Pacific Asia Conference on Language, Information and Computing, Phuket, Thailand, pp 145–151
Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks. In: Proceedings of NeurIPS 2014, Twenty-eighth Conference on Neural Information Processing Systems. Montreal, Canada, pp 3104–3112
Tanruangporn P (2017) Experimenting with Neural Machine Translation for Thai. https://medium.com/@petepeeradejtanruangporn/experimenting-with-neural-machine-translation-for-thai-1681fd2b375a Accessed 11 June 2018
Tapsai C, Meesad P, Unger H (2019) An Overview on the development of Thai natural language processing. Inf Technol J 15(2):45–52
TeCho J, Nattee C, Theeramunkong T (2009) A corpus-based approach for automatic Thai unknown word recognition using boosting techniques. IEICE Trans Inf Syst 92(12):2321–2333
Theeramunkong T, Usanavasin S (2001) Non-dictionary-based Thai word segmentation using decision trees. In: Proceedings of the First International Conference on Human Language Technology Research, San Diego, California, 5pp
Thonglor K (1972) Principles of Thai language. Ruamsam, Bangkok
Tongchim S, Altmeyer R, Sornlertlamvanich V, Isahara H (2008) A dependency parser for Thai. In: Proceedings of the International Conference on Language Resources and Evaluation, LREC 2008, Marrakech, Morocco
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. In: Proceedings of NeurIPS 2017, Thirty-first Conference on Neural Information Processing Systems, Long Beach, CA, pp 5998–6008
Way A (2018) Quality expectations of machine translation. In: Moorkens J, Castilho S, Gaspari F, Doherty S (eds) Translation quality assessment. Springer, Cham, Switzerland
Way A, Hearne M (2011) On the role of translations in state-of-the-art statistical machine translation. Lang Ling Compass 5:227–248
Wutiwiwatchai C (2015) Language and speech translation activities in Thailand. ASEAN-NICT Round Table, Feb 2015 [PowerPoint presentation]. Available at https://www.nict.go.jp/en/asean_ivo/lde9n2000000anwh-att/208-20150226_ASEAN-NICT_Chai.pdf
Wutiwiwatchai C, Furui S (2007) Thai speech processing technology: a review. Speech Commun 49(1):8–27
Wutiwiwatchai C, Hansakunbuntheung C, Rugchatjaroen A, Saychum S, Kasuriya S, Chootrakool P (2017) Thai text-to-speech synthesis: a review. J Intell Inf Smart Technol 2:7
Wutiwiwatchai C, Supnithi T, Boonkwan P (2013) The network-based ASEAN language translation public service. ASEAN-MT NAC 2013 [PowerPoint presentation]. Available at https://www.nstda.or.th/nac/2013/download/presentation/NAC2013_Set2/CC-308-01-AM/Chai.pdf
Wutiwiwatchai C, Supnithi T, Kosawat K (2008) Speech-to-speech translation activities in Thailand. In: Proceedings of the Workshop on Technologies and Corpora for Asia-Pacific Speech Translation (TCAST), Hyderabad, India, pp 1–7
Wutiwiwatchai C, Supnithi T, Porkaew P, Thatphithakkul N (2009) Improvement issues in English-Thai speech translation. In: Proceedings of the Workshop on Technologies and Corpora for Asia-Pacific Speech Translation (TCAST), Suntec, Singapore, pp 10–14
Xiong H, He Z, Hu X, Wu H (2018) Multi-channel encoder for neural machine translation. In: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, USA, pp 4962–4969
Wu Y, Schuster M, Chen Z, Le QV, Norouzi M, Macherey W, Krikun M, Cao Y, Gao Q, Macherey K, Klingner J (2016) Google's neural machine translation system: bridging the gap between human and machine translation. arXiv:1609.08144
Zhang J, Wang M, Liu Q, Zhou J (2017) Incorporating word reordering knowledge into attention-based neural machine translation. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Vancouver, Canada, pp 1524–1534
Zhou J, Cao Y, Wang X, Li P, Xu W (2016) Deep recurrent models with fast-forward connections for neural machine translation. Trans Assoc Comput Ling 4:371–383
Zhou N, Aw A, Lertcheva N, Wang X (2016) A word labeling approach to Thai sentence boundary detection and POS tagging. In: Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, Osaka, Japan, pp 319–327
Cite this article
Lyons, S. A review of Thai–English machine translation. Machine Translation 34, 197–230 (2020). https://doi.org/10.1007/s10590-020-09248-8
DOI: https://doi.org/10.1007/s10590-020-09248-8