Abstract
Entity linking, which usually involves mention recognition and entity disambiguation, is an important task in knowledge base question answering (KBQA). However, owing to the diversity of Chinese grammatical structures, the complexity of Chinese natural language expressions, and the lack of contextual information, Chinese KBQA still poses many challenges. We address the two subtasks of entity linking separately. For mention recognition, in order to obtain the single topic entity mention of a question, we propose a topic entity mention recognition algorithm based on sequence labeling. The algorithm combines a variety of feature vectors built on word embeddings and uses a BiGRU-CRF model to label the sequence. For entity disambiguation, we propose an algorithm based on similarity calculation with extended information. The algorithm expands the available information by crawling questions related to each candidate entity, and makes full use of contextual information by combining lexical-level similarity with sentence-level semantic similarity. Experimental results show that the proposed entity linking solution has substantial advantages over several baseline systems.
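The disambiguation idea summarized above, scoring each candidate entity by combining a lexical-level similarity with a sentence-level similarity against its related questions, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the abstract does not name the measures, so Jaccard overlap stands in for lexical similarity and a bag-of-words cosine stands in for sentence semantic similarity. The function names, the weight `alpha`, whitespace tokenization (Chinese text would need a word segmenter or character-level tokens), and the example questions are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: whitespace tokens. Chinese input would need a segmenter.
    return text.lower().split()


def jaccard(a, b):
    """Lexical-level similarity: Jaccard index over token sets."""
    sa, sb = set(tokenize(a)), set(tokenize(b))
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def cosine(a, b):
    """Stand-in for sentence similarity: cosine over token counts."""
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    if na == 0 or nb == 0:
        return 0.0
    return dot / (na * nb)


def score(question, related_questions, alpha=0.5):
    """Best weighted similarity between the question and any related question."""
    return max(
        (alpha * jaccard(question, q) + (1 - alpha) * cosine(question, q)
         for q in related_questions),
        default=0.0,
    )


def disambiguate(question, candidates, alpha=0.5):
    """Return the candidate whose related questions best match the question."""
    return max(candidates, key=lambda name: score(question, candidates[name], alpha))


# Made-up example data: candidate entity -> questions gathered about it.
candidates = {
    "Alibaba Co.": ["who founded alibaba group"],
    "Ali Baba (folk tale)": ["who wrote ali baba"],
}
print(disambiguate("who founded alibaba", candidates))  # prints "Alibaba Co."
```

In this sketch a real sentence-level model (for example, one based on word embeddings) would replace `cosine`, and `alpha` would be tuned on held-out data rather than fixed at 0.5.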
Notes
- 1. Alibaba Co.
- 2. Yun Ma is the creator of Alibaba Co.
- 3.
- 4. Smartisan Technology Co., Ltd., commonly known as Smartisan, is a Chinese multinational technology company headquartered in Beijing and Chengdu.
Acknowledgements
Gang Wu is supported by the NSFC (Grant No. 61872072), the State Key Laboratory of Computer Software New Technology Open Project Fund (Grant No. KFKT2018B05), and the National Key R&D Program of China (Grant No. 2016YFC1401900).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Wu, G., Wu, W., Ji, H., Hou, X., Xia, L. (2020). Enhanced Entity Mention Recognition and Disambiguation Technologies for Chinese Knowledge Base Q&A. In: Wang, X., Lisi, F., Xiao, G., Botoeva, E. (eds) Semantic Technology. JIST 2019. Lecture Notes in Computer Science, vol 12032. Springer, Cham. https://doi.org/10.1007/978-3-030-41407-8_7
DOI: https://doi.org/10.1007/978-3-030-41407-8_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-41406-1
Online ISBN: 978-3-030-41407-8
eBook Packages: Computer Science, Computer Science (R0)