Enhanced Entity Mention Recognition and Disambiguation Technologies for Chinese Knowledge Base Q&A

  • Conference paper
Semantic Technology (JIST 2019)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12032)

Abstract

Entity linking, which usually comprises mention recognition and entity disambiguation, is an important task in knowledge base question answering (KBQA). However, the diversity of Chinese grammatical structures, the complexity of Chinese natural language expressions, and the lack of contextual information leave many challenges for Chinese KBQA. We address the two subtasks of entity linking separately. For mention recognition, in order to obtain the single topic entity mention of a question, we propose a topic entity mention recognition algorithm based on sequence labeling. The algorithm combines a variety of feature vectors built on word embeddings and uses a BiGRU-CRF model for sequence labeling. We also propose an entity disambiguation algorithm based on similarity calculation with extended information. The algorithm expands the available information by crawling questions related to each candidate entity, and makes full use of contextual information by combining lexical-level similarity with sentence-level semantic similarity. Experimental results show that the proposed entity linking solution has clear advantages over several baseline systems.
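The disambiguation idea in the abstract, scoring each candidate entity by combining a lexical-level similarity with a sentence-level similarity over extended texts, can be illustrated with a minimal sketch. The abstract does not give the actual formula, so everything below is an assumption made for illustration: character-level Jaccard stands in for the lexical measure, a character-count cosine stands in for the learned sentence semantic similarity, the weight `alpha` and the function names are hypothetical, and the toy questions are invented rather than taken from the paper's data.

```python
from collections import Counter
from math import sqrt


def jaccard(a: str, b: str) -> float:
    """Character-level Jaccard similarity (works on unsegmented Chinese text)."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def cosine_chars(a: str, b: str) -> float:
    """Cosine similarity of character-count vectors.

    Assumption: a simple stand-in for a learned sentence semantic similarity model.
    """
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[ch] * cb[ch] for ch in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def score(question: str, extended_texts: list[str], alpha: float = 0.5) -> float:
    """Best weighted similarity between the question and any extended text.

    alpha is a hypothetical mixing weight between the two similarity measures.
    """
    return max(
        (alpha * jaccard(question, t) + (1 - alpha) * cosine_chars(question, t)
         for t in extended_texts),
        default=0.0,
    )


def disambiguate(question: str, candidates: dict[str, list[str]], alpha: float = 0.5) -> str:
    """Return the candidate entity whose extended texts best match the question."""
    return max(candidates, key=lambda name: score(question, candidates[name], alpha))


# Invented toy data: each candidate entity is paired with related questions
# that would, in the described approach, be gathered by crawling.
question = "苹果手机多少钱"
candidates = {
    "apple (fruit)": ["苹果是一种水果", "苹果有什么营养"],
    "Apple Inc.": ["苹果手机哪款好", "苹果手机多少钱一部"],
}
print(disambiguate(question, candidates))
```

In this sketch the extended texts do the work that the short question cannot: the mention alone is ambiguous, but the related questions attached to each candidate supply the context that separates them.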

Notes

  1. Alibaba Co.

  2. Yun Ma is the founder of Alibaba Co.

  3. https://zhidao.baidu.com.

  4. Smartisan Technology Co., Ltd., commonly known as Smartisan, is a Chinese multinational technology company headquartered in Beijing and Chengdu.

Acknowledgements

Gang Wu is supported by the NSFC (Grant No. 61872072), the State Key Laboratory of Computer Software New Technology Open Project Fund (Grant No. KFKT2018B05), and the National Key R&D Program of China (Grant No. 2016YFC1401900).

Corresponding author

Correspondence to Gang Wu.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Wu, G., Wu, W., Ji, H., Hou, X., Xia, L. (2020). Enhanced Entity Mention Recognition and Disambiguation Technologies for Chinese Knowledge Base Q&A. In: Wang, X., Lisi, F., Xiao, G., Botoeva, E. (eds) Semantic Technology. JIST 2019. Lecture Notes in Computer Science, vol 12032. Springer, Cham. https://doi.org/10.1007/978-3-030-41407-8_7

  • DOI: https://doi.org/10.1007/978-3-030-41407-8_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-41406-1

  • Online ISBN: 978-3-030-41407-8

  • eBook Packages: Computer Science, Computer Science (R0)
