Abstract
In this paper, we propose a maximum entropy (ME) classifier for resolving zero anaphora (ZA) in Chinese. In addition to standard grammatical, lexical, positional and semantic features, we develop two novel Web-based features that extract additional semantic information about ZA from the Web. Our study shows that the Web, as a knowledge source, can be incorporated effectively into the learning framework and significantly improves its performance. When applied to machine translation (MT), ZA resolution serves as a pre-processing module that is detachable and independent of the MT system. The experimental results demonstrate a significant improvement in BLEU/NIST scores once ZA resolution is employed.
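To make the classification setup in the abstract concrete, here is a minimal sketch of a binary maximum entropy model (equivalent to logistic regression) that scores candidate antecedents for a zero anaphor. It is not the authors' implementation. The feature names, the toy training pairs, and the helper functions `prob_antecedent`, `train`, and `resolve` are all assumptions made for illustration, and the Web-based feature is reduced to a single hypothetical binary indicator.

```python
import math

# Hypothetical binary features for a (zero anaphor, candidate antecedent) pair.
# These names are illustrative assumptions, not the paper's feature set.
FEATURES = ["same_sentence", "candidate_is_subject", "web_cooccurrence_high"]


def prob_antecedent(weights, bias, x):
    """Binary maximum entropy model: p(y=1 | x) = 1 / (1 + exp(-(w.x + b)))."""
    score = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-score))


def train(data, epochs=500, lr=0.5):
    """Fit weights by batch gradient ascent on the conditional log-likelihood."""
    weights = [0.0] * len(FEATURES)
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(FEATURES)
        grad_b = 0.0
        for x, y in data:
            err = y - prob_antecedent(weights, bias, x)
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        weights = [w + lr * g / len(data) for w, g in zip(weights, grad_w)]
        bias += lr * grad_b / len(data)
    return weights, bias


def resolve(weights, bias, candidates):
    """Return the name of the candidate with the highest antecedent probability."""
    return max(candidates, key=lambda c: prob_antecedent(weights, bias, c[1]))[0]


# Toy training pairs, made up for illustration: (feature vector, is_antecedent).
TOY_DATA = [
    ([1, 1, 1], 1),
    ([1, 0, 1], 1),
    ([0, 1, 1], 1),
    ([1, 0, 0], 0),
    ([0, 0, 0], 0),
    ([0, 1, 0], 0),
]

weights, bias = train(TOY_DATA)

# Two hypothetical candidate noun phrases for one zero anaphor.
best = resolve(weights, bias, [("NP_a", [1, 0, 0]), ("NP_b", [0, 1, 1])])
print(best)
```

In this toy data the Web-derived indicator separates the classes, so its learned weight is positive and the candidate that has it ranks first. A real system would use many more features and a regularised trainer.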
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Peng, J., Araki, K. (2007). Zero Anaphora Resolution in Chinese and Its Application in Chinese-English Machine Translation. In: Kedad, Z., Lammari, N., Métais, E., Meziane, F., Rezgui, Y. (eds) Natural Language Processing and Information Systems. NLDB 2007. Lecture Notes in Computer Science, vol 4592. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73351-5_32
DOI: https://doi.org/10.1007/978-3-540-73351-5_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73350-8
Online ISBN: 978-3-540-73351-5
eBook Packages: Computer Science, Computer Science (R0)