Identification of Non-referential Zero Pronouns for Korean-English Machine Translation

Kim, Kye-Sung; Park, Seong-Bae; Song, Hyun-Je; Park, Se Young; Lee, Sang-Jo

doi:10.1007/978-3-642-15246-7_13

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6230)

Included in the following conference series:

Pacific Rim International Conference on Artificial Intelligence

Abstract

The frequent use of null arguments is one of the most critical issues in pro-drop languages. When translating Korean into other languages, omitted elements must be replaced with appropriate pronouns to produce grammatical target sentences. A key step in handling zero pronouns is determining their referentiality: like expletive ‘it’ in English, omitted elements do not always have explicit referents, so it is important to decide whether a zero pronoun is referential or not. In this paper, we focus on identifying non-referential zero pronouns. Since non-referential zero pronouns are likely to occur in similar contexts, we treat referentiality determination as the identification of clauses containing non-referential zero pronouns. Our method outperforms baseline systems using n-grams and bag of words, achieving F-measures of 0.51 and 0.78.
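
As an illustration only, the kind of n-gram and bag-of-words clause features used by the baseline systems, and the F-measure used for evaluation, might be sketched as follows. This is not the authors' implementation: the function names, the placeholder tokens, and the choice of the balanced F1 form of the F-measure are all assumptions made for this sketch, since the abstract does not specify them.

```python
from collections import Counter


def bag_of_words(tokens):
    """Count each token in a clause, ignoring order (hypothetical helper)."""
    return Counter(tokens)


def ngrams(tokens, n):
    """Count contiguous n-token sequences in a clause (hypothetical helper)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f_measure(gold, predicted, positive=True):
    """Balanced F-measure (F1) for the positive class.

    Here the positive class stands for "clause contains a non-referential
    zero pronoun"; using F1 rather than another F-beta is an assumption.
    """
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Placeholder tokens standing in for the morphemes of one clause.
clause = ["w1", "w2", "w3", "w2"]
bow_features = bag_of_words(clause)
bigram_features = ngrams(clause, 2)

# Toy gold labels and predictions for four clauses (not data from the paper).
gold = [True, True, False, False]
predicted = [True, False, True, False]
score = f_measure(gold, predicted)
```

Feature counts like these would then be passed to a classifier of the reader's choice; which classifier the baselines or the proposed method use is not stated in the abstract.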

Author information

Authors and Affiliations

Department of Computer Engineering, Kyungpook National University, 702-701, Daegu, Korea
Kye-Sung Kim, Seong-Bae Park, Hyun-Je Song, Se Young Park & Sang-Jo Lee

Editor information

Editors and Affiliations

School of Computer Science and Engineering, Seoul National University, 151-744, Seoul, Korea
Byoung-Tak Zhang
Department of Computing, Macquarie University, NSW, Sydney, Australia
Mehmet A. Orgun

About this paper

Cite this paper

Kim, KS., Park, SB., Song, HJ., Park, S.Y., Lee, SJ. (2010). Identification of Non-referential Zero Pronouns for Korean-English Machine Translation. In: Zhang, BT., Orgun, M.A. (eds) PRICAI 2010: Trends in Artificial Intelligence. PRICAI 2010. Lecture Notes in Computer Science, vol 6230. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15246-7_13

DOI: https://doi.org/10.1007/978-3-642-15246-7_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15245-0
Online ISBN: 978-3-642-15246-7
eBook Packages: Computer Science, Computer Science (R0)
