Abstract
Knowledge-Based Question Answering (KBQA) uses the rich semantic information in knowledge bases to understand questions and retrieve answers. Mainstream approaches fall into two families: Semantic Parsing-Based (SP-based) and Information Retrieval-Based (IR-based). SP-based methods parse the question into a machine-executable logical form and then query the knowledge base for answers. IR-based methods first identify the topic entity in the question and retrieve candidate answers, then extract features from both the question and the candidates, and finally use a ranking model to score the candidates against the question. Compared with the impressive results achieved by English KBQA systems, Chinese KBQA systems face challenges because Chinese knowledge bases have sparse semantic expression and limited features, and contain many similar entities that are difficult to differentiate. General models therefore struggle to capture the characteristics of the text, which makes it hard to improve Entity Linking accuracy and, in turn, the performance of the KBQA system. To address this, this paper divides Entity Linking in the KBQA system into two steps, Candidate Generation (CG) and Entity Disambiguation (ED), and focuses on Entity Disambiguation. Entity Disambiguation is treated as a classification task, and a Dual-Channel Network Model based on Bi-LSTM and CNN is constructed. The model combines the different features extracted by Bi-LSTM and CNN and introduces an attention mechanism to fully explore the weak semantic relationship between the question and the candidate entity, reducing the question answering system's reliance on additional feature rules.
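The two Entity Linking steps described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy knowledge base, the `Entity` fields, and the function names are hypothetical, and the word-overlap scorer is only a stand-in for the learned Bi-LSTM/CNN classifier with attention, whose architecture is not reproduced here. The sketch shows only the control flow: generate candidates from the question, score each (question, candidate) pair, and keep the best one.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entity:
    entity_id: str    # hypothetical knowledge-base identifier
    mention: str      # surface form that may appear in a question
    description: str  # short KB text used as disambiguation context


# Toy knowledge base, invented for this example.
TOY_KB = [
    Entity("apple_company", "apple", "technology company that makes phones"),
    Entity("apple_fruit", "apple", "sweet fruit that grows on trees"),
    Entity("pear_fruit", "pear", "green fruit that grows on trees"),
]


def generate_candidates(question: str, kb: List[Entity]) -> List[Entity]:
    """Candidate Generation: keep entities whose mention occurs in the question."""
    text = question.lower()
    return [entity for entity in kb if entity.mention in text]


def match_score(question: str, entity: Entity) -> float:
    """Stand-in scorer: Jaccard overlap between question and description words.

    Assumption: this replaces the paper's learned classifier and only mimics
    its interface, (question, candidate) -> score in [0, 1].
    """
    q_words = set(question.lower().replace("?", "").split())
    e_words = set(entity.description.lower().split())
    union = q_words | e_words
    return len(q_words & e_words) / len(union) if union else 0.0


def link_entity(question: str, kb: List[Entity]) -> Optional[Entity]:
    """Entity Disambiguation: return the highest-scoring candidate, if any."""
    candidates = generate_candidates(question, kb)
    if not candidates:
        return None
    return max(candidates, key=lambda entity: match_score(question, entity))
```

For the question "Which company makes the apple phones?", both "apple" entities are generated as candidates, and the scorer prefers the company because its description shares words with the question. In a real system the scoring function would be replaced by the trained classifier, while the surrounding two-step structure would stay the same.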
Experimental results show that the proposed Entity Linking model effectively improves the performance of the question answering system, generalizes well, weakens dependence on additional information, and maintains Q&A quality while reducing manual intervention. Our method achieves the current best average F1 value on the Chinese open-domain datasets NLPCC-2016KBQA and CCKS2019KBQA.
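The abstract does not define how the average F1 value is computed. A common convention in KBQA evaluation, assumed here, is to compute F1 between the predicted and gold answer sets for each question and then average over all questions. The function names and example data below are hypothetical.

```python
from typing import List, Set


def question_f1(predicted: Set[str], gold: Set[str]) -> float:
    """F1 between the predicted and gold answer sets of a single question."""
    correct = len(predicted & gold)
    if correct == 0:
        # Covers empty predictions, empty gold sets, and no overlap.
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


def average_f1(predictions: List[Set[str]], golds: List[Set[str]]) -> float:
    """Mean of the per-question F1 scores (assumed definition of average F1)."""
    if len(predictions) != len(golds):
        raise ValueError("predictions and golds must have the same length")
    if not golds:
        return 0.0
    total = sum(question_f1(p, g) for p, g in zip(predictions, golds))
    return total / len(golds)
```

Under this definition a question answered with one correct and one spurious entity contributes 2/3 rather than 0 or 1, so the metric rewards partially correct answer sets.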
Data Availability
The data sets supporting the results of this article are included within the article and its additional files.
Acknowledgements
The authors gratefully acknowledge the financial support of the National Key R&D Program of China (Grant No. 2020AAA0109300).
Funding
The research leading to these results received funding from the National Key R&D Program of China under Grant No. 2020AAA0109300.
Ethics declarations
Conflict of interest/Competing interests.
The authors have no conflicts of interest to declare that are relevant to the content of this article.
About this article
Cite this article
Chen, Y., Wan, W., Zhao, Y. et al. Generalization performance optimization of KBQA system for Chinese open domain. Multimed Tools Appl 83, 12445–12466 (2024). https://doi.org/10.1007/s11042-023-16011-7
DOI: https://doi.org/10.1007/s11042-023-16011-7