Abstract
This paper examines problems that arise in the machine translation of proper names (PNs) from English into Vietnamese. We build an English-Vietnamese parallel corpus from online BBC News texts containing numerous PNs, translate it with four machine translation (MT) systems, and then classify and analyze the resulting PN errors. We also propose pre-processing solutions for reducing these errors and test them on a manually annotated corpus, with the aim of significantly improving MT quality.
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Phan, T.T.T., Thomas, I. (2012). English-Vietnamese Machine Translation of Proper Names. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2012. Lecture Notes in Computer Science, vol 7499. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32790-2_47
DOI: https://doi.org/10.1007/978-3-642-32790-2_47
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32789-6
Online ISBN: 978-3-642-32790-2
eBook Packages: Computer Science, Computer Science (R0)