Abstract
This paper presents our work on acquiring translational equivalence from a Japanese-Chinese parallel corpus. We follow and extend existing word alignment techniques, including statistical and heuristic models, to achieve high performance. In addition to the statistics of the parallel corpus, we exploit lexical knowledge of the language pair, such as orthographic cognates and a bilingual dictionary. The implemented aligner is applied to annotating word alignment in the parallel corpus, and its output is evaluated. The experimental results demonstrate the usability of the aligner for our task.
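To make the heuristic side of this concrete, the following is a minimal sketch of how a bilingual dictionary and orthographic (shared-character) similarity could be combined to link words in a segmented sentence pair. It is an illustration under stated assumptions, not the aligner described in the paper: the function names `char_overlap` and `align`, the Dice-style overlap score, the threshold of 0.5, the greedy one-to-one linking, and the toy dictionary are all hypothetical, and the statistical component is omitted entirely.

```python
def char_overlap(ja_word, zh_word):
    """Dice coefficient over the character sets of two words.

    Assumption: shared characters stand in for orthographic cognates.
    """
    a, b = set(ja_word), set(zh_word)
    if not a or not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def align(ja_words, zh_words, dictionary, threshold=0.5):
    """Greedy one-to-one alignment; returns sorted (ja_index, zh_index) pairs."""
    candidates = []
    for i, ja in enumerate(ja_words):
        for j, zh in enumerate(zh_words):
            # A dictionary hit gets the maximum score; otherwise fall back
            # to character overlap between the two surface forms.
            if zh in dictionary.get(ja, ()):
                score = 1.0
            else:
                score = char_overlap(ja, zh)
            if score >= threshold:
                candidates.append((score, i, j))

    # Best score first; ties are broken by position for determinism.
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))

    used_ja, used_zh, links = set(), set(), []
    for _, i, j in candidates:
        if i not in used_ja and j not in used_zh:
            links.append((i, j))
            used_ja.add(i)
            used_zh.add(j)
    return sorted(links)


# Toy sentence pair and toy dictionary (hypothetical data).
ja = ["私", "は", "学生", "です"]
zh = ["我", "是", "学生"]
toy_dict = {"私": {"我"}, "です": {"是"}}

links = align(ja, zh, toy_dict)
print(links)
```

In this toy pair, two links come from the dictionary and one from identical characters, while the particle は is left unaligned. A fuller system would presumably also need to handle characters that are written differently in the two languages and many-to-many links, which this sketch does not attempt.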
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, Y., Ma, Q., Liu, Q., Chen, W., Isahara, H. (2006). Acquiring Translational Equivalence from a Japanese-Chinese Parallel Corpus. In: Matsumoto, Y., Sproat, R.W., Wong, KF., Zhang, M. (eds) Computer Processing of Oriental Languages. Beyond the Orient: The Research Challenges Ahead. ICCPOL 2006. Lecture Notes in Computer Science, vol 4285. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11940098_39
DOI: https://doi.org/10.1007/11940098_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49667-0
Online ISBN: 978-3-540-49668-7
eBook Packages: Computer Science, Computer Science (R0)