Identification of Subject Shareness for Korean-English Machine Translation

  • Conference paper
PRICAI 2008: Trends in Artificial Intelligence (PRICAI 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5351)

Abstract

One of the most critical issues in translating Korean into other languages is the frequent use of empty arguments. Unlike in English, even mandatory elements are often dropped in Korean, so the missing elements must be resolved during translation to obtain grammatical sentences. In this paper, we focus on missing subjects at the intra-sentential level, a problem that can be regarded as identifying whether clauses share a subject. To reflect syntactic information in resolving missing subjects, we use a parse tree kernel, a specialized convolution kernel. In the experimental evaluation, syntactic information turns out to be positively related to the identification of subject shareness. Our method achieves an accuracy of 81.39% and outperforms a baseline system that assumes two adjacent clauses share a subject.
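As an illustration of what a convolution kernel over parse trees computes, the following is a minimal sketch of a standard subset-tree kernel in the style of Collins and Duffy. It is not necessarily the specialized parse tree kernel used in the paper, whose details the abstract does not give. The nested-tuple tree encoding, the function names, and the decay parameter `lam` are assumptions made only for this example.

```python
import math
from functools import lru_cache

# Assumed encoding: a tree is a nested tuple (label, child, child, ...);
# a leaf word is a plain string.

def production(node):
    """Return the grammar production at a node, e.g. S -> NP VP."""
    label, *children = node
    # Tag each child so a word and a nonterminal with the same spelling differ.
    rhs = tuple(("w", c) if isinstance(c, str) else ("n", c[0]) for c in children)
    return (label, rhs)

def internal_nodes(tree):
    """List every non-leaf node of the tree."""
    if isinstance(tree, str):
        return []
    nodes = [tree]
    for child in tree[1:]:
        nodes.extend(internal_nodes(child))
    return nodes

def tree_kernel(t1, t2, lam=1.0):
    """Weighted count of tree fragments shared by t1 and t2."""
    @lru_cache(maxsize=None)
    def common(n1, n2):
        # Fragments rooted at n1 and n2 can match only if the productions match.
        if production(n1) != production(n2):
            return 0.0
        score = lam
        for c1, c2 in zip(n1[1:], n2[1:]):
            if not isinstance(c1, str):  # recurse into nonterminal children only
                score *= 1.0 + common(c1, c2)
        return score

    return sum(common(a, b) for a in internal_nodes(t1) for b in internal_nodes(t2))

def normalized_kernel(t1, t2, lam=1.0):
    """Scale the kernel to [0, 1] so that tree size does not dominate."""
    denom = math.sqrt(tree_kernel(t1, t1, lam) * tree_kernel(t2, t2, lam))
    return tree_kernel(t1, t2, lam) / denom

# Hypothetical toy parse trees, not taken from the paper's data.
tree_a = ("S", ("NP", "I"), ("VP", "run"))
tree_b = ("S", ("NP", "you"), ("VP", "run"))
print(tree_kernel(tree_a, tree_b))        # 3.0
print(normalized_kernel(tree_a, tree_b))  # 0.5
```

A kernel of this kind can be supplied to a kernel-based classifier such as a support vector machine, so that syntactic similarity between clause pairs informs the decision without hand-crafted structural features.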

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kim, KS., Park, SB., Song, HJ., Park, SY., Lee, SJ. (2008). Identification of Subject Shareness for Korean-English Machine Translation. In: Ho, TB., Zhou, ZH. (eds) PRICAI 2008: Trends in Artificial Intelligence. PRICAI 2008. Lecture Notes in Computer Science, vol 5351. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89197-0_22

  • DOI: https://doi.org/10.1007/978-3-540-89197-0_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-89196-3

  • Online ISBN: 978-3-540-89197-0

  • eBook Packages: Computer Science, Computer Science (R0)
