Acquiring Translational Equivalence from a Japanese-Chinese Parallel Corpus

Zhang, Yujie; Ma, Qing; Liu, Qun; Chen, Wenliang; Isahara, Hitoshi

doi:10.1007/11940098_39

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4285)

Included in the following conference series:

International Conference on Computer Processing of Oriental Languages

Abstract

This paper presents our work on acquiring translational equivalence from a Japanese-Chinese parallel corpus. To achieve high performance, we follow and extend existing word alignment techniques, including both statistical and heuristic models. In addition to statistics from the parallel corpus, we exploit lexical knowledge of the language pair, such as orthographic cognates and a bilingual dictionary. The implemented aligner is applied to annotating word alignments in the parallel corpus, and its output is evaluated. The experimental results show that the aligner is usable for our task.

Author information

Authors and Affiliations

Computational Linguistics Group, National Institute of Information and Communications Technology, 3-5 Hikari-dai, Seika-cho, Soraku-gun, Kyoto, 619-0289, Japan
Yujie Zhang, Wenliang Chen & Hitoshi Isahara
Department of Applied Mathematics and Informatics, Ryukoku University, Seta, Otsu, 520-2194, Japan
Qing Ma
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Qun Liu

Editor information

Editors and Affiliations

Graduate School of Information Science, Nara Institute of Science and Technology, 630-0192, Takayama, Ikoma, Nara, Japan
Yuji Matsumoto
Dept of ECE, University of Illinois at Urbana Champaign, IL 61801, Urbana, USA
Richard W. Sproat
Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
Kam-Fai Wong
State Key Lab of Intelligent Tech. & Sys., Tsinghua University,
Min Zhang

About this paper

Cite this paper

Zhang, Y., Ma, Q., Liu, Q., Chen, W., Isahara, H. (2006). Acquiring Translational Equivalence from a Japanese-Chinese Parallel Corpus. In: Matsumoto, Y., Sproat, R.W., Wong, KF., Zhang, M. (eds) Computer Processing of Oriental Languages. Beyond the Orient: The Research Challenges Ahead. ICCPOL 2006. Lecture Notes in Computer Science, vol 4285. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11940098_39

DOI: https://doi.org/10.1007/11940098_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49667-0
Online ISBN: 978-3-540-49668-7
eBook Packages: Computer Science, Computer Science (R0)
