Automatic Extraction of Rules for Anaphora Resolution of Japanese Zero Pronouns in Japanese–English Machine Translation from Aligned Sentence Pairs

Machine Translation

Abstract

This paper proposes a method for extracting rules for the anaphora resolution of Japanese zero pronouns in Japanese–English MT from aligned sentence pairs. First, aligned sentence pairs that are unsuitable for rule extraction because of analysis errors or free translations are automatically rejected. Zero pronouns in the Japanese sentences and the English translation equivalents of their antecedents are then extracted from the remaining aligned sentence pairs using ten hand-developed alignment rules. Japanese zero pronouns whose translation equivalents are not explicitly expressed in the English sentence are identified as unalignable. Resolution rules for the remaining zero pronouns are then automatically extracted using the aligned pairs, equivalent word/phrase pairs extracted from the aligned sentence pairs, and the syntactic and semantic structures of the Japanese sentences. The method was implemented in a Japanese–English MT system, ALT-J/E. In a window test, 98.4% of all pairs were automatically aligned correctly, and 94.0% in a blind test. Furthermore, rules for zero pronouns with deictic references, extracted automatically from sentence pairs, correctly resolved 99.0% of the zero pronouns in a window test and 85.0% in a blind test.
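The stages named in the abstract (reject unsuitable pairs, align zero pronouns with English equivalents, set aside unalignable ones, extract resolution rules) can be illustrated with a toy sketch. This is not the ALT-J/E implementation. The abstract does not list the ten alignment rules, so a single role-matching heuristic stands in for them here. Every name below (`ZeroPronoun`, `AlignedPair`, `CASE_TO_ROLE`, the `verb_class` and `modality` labels, the `rejected` flag) is a hypothetical chosen for illustration only.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass

@dataclass
class ZeroPronoun:
    case: str        # hypothetical case label, e.g. "ga" (subject), "wo" (object)
    verb_class: str  # hypothetical semantic class of the governing verb
    modality: str    # hypothetical modal expression label

@dataclass
class AlignedPair:
    zero_pronouns: list     # zero pronouns found in the Japanese sentence
    english_roles: dict     # grammatical role -> English word filling it
    rejected: bool = False  # stand-in for the automatic rejection step

# Assumed mapping from Japanese case to English grammatical role.
CASE_TO_ROLE = {"ga": "subject", "wo": "object"}

def align(pair):
    """Pair each zero pronoun with the English word in the matching role, or None."""
    return [(zp, pair.english_roles.get(CASE_TO_ROLE.get(zp.case)))
            for zp in pair.zero_pronouns]

def extract_rules(pairs):
    """Count equivalents per condition; keep the most frequent one as the rule."""
    counts = defaultdict(Counter)
    unalignable = 0
    for pair in pairs:
        if pair.rejected:        # skip pairs judged unsuitable for extraction
            continue
        for zp, equivalent in align(pair):
            if equivalent is None:   # no explicit English equivalent
                unalignable += 1
                continue
            counts[(zp.case, zp.verb_class, zp.modality)][equivalent] += 1
    rules = {cond: c.most_common(1)[0][0] for cond, c in counts.items()}
    return rules, unalignable

pairs = [
    AlignedPair([ZeroPronoun("ga", "thinking", "none")], {"subject": "I", "object": "it"}),
    AlignedPair([ZeroPronoun("ga", "action", "request")], {"subject": "you"}),
    AlignedPair([ZeroPronoun("ga", "thinking", "none")], {"subject": "I"}),
    AlignedPair([ZeroPronoun("wo", "action", "none")], {"subject": "he"}),
    AlignedPair([ZeroPronoun("ga", "action", "request")], {"subject": "they"}, rejected=True),
]

rules, unalignable = extract_rules(pairs)
print(rules, unalignable)
```

In this toy data the fourth pair has no English object, so its zero pronoun is counted as unalignable, and the fifth pair is skipped because it is flagged as rejected. A real system would derive the conditions from syntactic and semantic analysis of the Japanese sentence rather than from hand-written labels.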




Cite this article

Nakaiwa, H. Automatic Extraction of Rules for Anaphora Resolution of Japanese Zero Pronouns in Japanese–English Machine Translation from Aligned Sentence Pairs. Machine Translation 14, 247–279 (1999). https://doi.org/10.1023/A:1011181124507


  • DOI: https://doi.org/10.1023/A:1011181124507
