Abstract
Dropped pronouns (DPs) are ubiquitous in pro-drop languages such as Chinese and Japanese. Previous work has mainly focused on painstakingly engineering empirical features for DP recovery. In this paper, we propose a neural recovery machine (NRM) to model and recover DPs in Chinese, avoiding the non-trivial feature engineering process. Experimental results show that the proposed NRM significantly outperforms state-of-the-art approaches on two heterogeneous datasets. Further experiments on Chinese zero pronoun (ZP) resolution show that ZP resolution performance can also be improved by recovering ZPs to DPs.
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant Nos. 61502120, 61472105, 61772153), the Heilongjiang Philosophy and Social Science Research Project (16TQD03), the Young Research Foundation of Harbin University (HUYF2013-002), and the Project of the University Library Work Committee of Heilongjiang (2013-B-065).
Author information
Weinan Zhang is a lecturer at the Research Center for Social Computing and Information Retrieval, School of Computer Science and Technology, Harbin Institute of Technology, China. His research interests include human-computer dialogue, natural language processing, and information retrieval.
Ting Liu is a professor at the Research Center for Social Computing and Information Retrieval, School of Computer Science and Technology, Harbin Institute of Technology, China. His primary research interests are natural language processing, information retrieval, and social computing.
Qingyu Yin is a PhD student at the Research Center for Social Computing and Information Retrieval, School of Computer Science and Technology, Harbin Institute of Technology, China. His research interests are anaphora resolution and natural language processing.
Yu Zhang is a professor at the Research Center for Social Computing and Information Retrieval, School of Computer Science and Technology, Harbin Institute of Technology, China. His primary research interests are question answering, natural language processing, and information retrieval.
Cite this article
Zhang, W., Liu, T., Yin, Q. et al. Neural recovery machine for Chinese dropped pronoun. Front. Comput. Sci. 13, 1023–1033 (2019). https://doi.org/10.1007/s11704-018-7136-7
DOI: https://doi.org/10.1007/s11704-018-7136-7