Subject Recognition in Chinese Sentences for Chatbots

Li, Fangyuan; Wei, Huanhuan; Hao, Qiangda; Zeng, Ruihong; Shao, Hao; Chen, Wenliang

doi:10.1007/978-3-030-32236-6_46

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11839)

Included in the following conference series:

CCF International Conference on Natural Language Processing and Chinese Computing

Abstract

Subject recognition plays a significant role in conversations with a chatbot. (In this paper, “subject” refers to “zhu ti” in Chinese, while “grammatical subject” denotes the traditional “zhu yu”.) Misclassifying the subject of a sentence leads to errors in intent recognition. In this paper, we build a new dataset for subject recognition and propose several systems based on pre-trained language models. We first design annotation guidelines for human-chatbot conversational data and hire annotators to build a new dataset according to these guidelines. We then propose classification methods based on deep neural networks. Finally, we conduct extensive experiments to evaluate the performance of the different algorithms. The results show that our method achieves 88.5% \(F_1\) on the subject recognition task. We also compare our systems with three other chatbot systems and find that ours performs best.
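
The \(F_1\) score quoted in the abstract is the harmonic mean of precision and recall. As a minimal sketch of how such a score could be computed for one target label, assuming gold and predicted labels are available as parallel lists: the function name `f1_score`, the label names, and the toy data below are hypothetical and not taken from the paper, and the paper's exact evaluation protocol (for example, how scores are averaged across labels) is not stated on this page.

```python
from typing import Sequence


def f1_score(gold: Sequence[str], pred: Sequence[str], positive: str) -> float:
    """Return F1 for a single label: the harmonic mean of precision and recall."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    # Count true positives, false positives, and false negatives for the label.
    tp = sum(1 for g, p in zip(gold, pred) if g == positive and p == positive)
    fp = sum(1 for g, p in zip(gold, pred) if g != positive and p == positive)
    fn = sum(1 for g, p in zip(gold, pred) if g == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: precision or recall is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels for illustration only (not data from the paper).
gold = ["user", "bot", "user", "other", "user"]
pred = ["user", "user", "user", "other", "bot"]
score = f1_score(gold, pred, positive="user")
```

With several subject labels, per-label scores like this one would typically be combined by micro- or macro-averaging; which of these applies to the reported figure is an open assumption here.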

Notes

1. The term “limited” refers to specific reference and definite quantity, while “non-limited” refers to generic reference and indefinite quantity. The subjects discussed in this paper are limited, so non-limited people, such as the “men” in “All men must die”, are usually not regarded as subjects.
2. https://github.com/kpu/kenlm
3. A subject-verb relation in LTP. https://github.com/HIT-SCIR/ltp
4. https://github.com/winnie0/ChineseSubjectRecognition
5. https://radimrehurek.com/gensim/models/word2vec.html

Acknowledgements

The authors are supported by the National Natural Science Foundation of China (Grant No. 61603240). The corresponding author is Hao Shao. We also thank the anonymous reviewers for their insightful comments.

Author information

Authors and Affiliations

Gowild Robotics Co., Ltd., Shenzhen, China
Fangyuan Li, Huanhuan Wei, Qiangda Hao, Ruihong Zeng & Hao Shao
School of Computer Science and Technology, Soochow University, Suzhou, China
Wenliang Chen

Corresponding author

Correspondence to Hao Shao.

Editor information

Editors and Affiliations

Tsinghua University, Beijing, China
Jie Tang
National University of Singapore, Singapore, Singapore
Min-Yen Kan
Peking University, Beijing, China
Dongyan Zhao
Peking University, Beijing, China
Sujian Li
Zhengzhou University, Zhengzhou, China
Hongying Zan

About this paper

Cite this paper

Li, F., Wei, H., Hao, Q., Zeng, R., Shao, H., Chen, W. (2019). Subject Recognition in Chinese Sentences for Chatbots. In: Tang, J., Kan, MY., Zhao, D., Li, S., Zan, H. (eds) Natural Language Processing and Chinese Computing. NLPCC 2019. Lecture Notes in Computer Science, vol 11839. Springer, Cham. https://doi.org/10.1007/978-3-030-32236-6_46

DOI: https://doi.org/10.1007/978-3-030-32236-6_46
Published: 30 September 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-32235-9
Online ISBN: 978-3-030-32236-6
eBook Packages: Computer Science, Computer Science (R0)
