Fusing Essential Knowledge for Text-Based Open-Domain Question Answering

Su, Xiao; Li, Ying; Wu, Zhonghai

doi:10.1007/978-3-030-75765-6_50

Xiao Su (ORCID: orcid.org/0000-0002-0666-7741), Ying Li & Zhonghai Wu

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12713)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

Question answering (QA) systems can be classified as either text-based QA systems or knowledge base QA (KBQA) systems, depending on the knowledge source they use. KBQA systems are generally domain-specific and cannot handle the variety of questions in the open-domain QA setting, while text-based systems can. However, the performance of text-based systems is far from satisfactory. This paper focuses on the text-based open-domain QA setting. We argue that the poor performance of text-based approaches is largely caused by plain text lacking knowledge that is often essential for answering the question and that can be easily found in a knowledge base (KB). We therefore propose a new text-based open-domain QA system called KF (Knowledge Fusion)-QA, which uses a KB as a second knowledge source to incorporate essential knowledge into the text to help answer the question. Our system has a Knowledge-Aware Encoder, which extracts essential knowledge from the KB and performs knowledge fusion to output knowledge-aware (KA) text representations. With these KA representations, the system first re-ranks the retrieved documents, then reads the re-ranked top-N documents to give the answer. Our system significantly outperforms existing text-based QA systems on multiple open-domain QA datasets, demonstrating the effectiveness of fusing essential knowledge.
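
The fuse-then-re-rank flow described in the abstract can be illustrated with a deliberately simplified sketch. It is not the authors' Knowledge-Aware Encoder: the neural encoder is replaced by a toy token-overlap scorer, the KB is a hypothetical dictionary mapping entity names to fact strings, and the reading step is omitted. The names `ToyKB`, `fuse_knowledge`, `score`, and `rerank` are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy knowledge base: entity name -> list of fact strings.
ToyKB = Dict[str, List[str]]


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it on non-alphanumeric characters."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return cleaned.split()


def fuse_knowledge(text: str, kb: ToyKB) -> str:
    """Append the KB facts of every KB entity mentioned in the text.

    Stand-in for knowledge fusion: the real system fuses knowledge into
    learned representations, not into the surface string.
    """
    lowered = text.lower()
    facts = [
        fact
        for entity, entity_facts in kb.items()
        if entity.lower() in lowered
        for fact in entity_facts
    ]
    return " ".join([text] + facts)


def score(question: str, doc: str, kb: ToyKB) -> float:
    """Fraction of question tokens found in the knowledge-fused document."""
    q_tokens = set(tokenize(question))
    if not q_tokens:
        return 0.0
    d_tokens = set(tokenize(fuse_knowledge(doc, kb)))
    return len(q_tokens & d_tokens) / len(q_tokens)


def rerank(
    question: str, docs: List[str], kb: ToyKB, top_n: int
) -> List[Tuple[float, str]]:
    """Score every retrieved document and keep the top_n best ones."""
    scored = [(score(question, doc, kb), doc) for doc in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_n]
```

In this toy setting, a document that mentions an entity gains the vocabulary of that entity's KB facts, so it can match question words that the plain text never states. That mirrors, in a very crude form, the argument that plain text often lacks knowledge that a KB can supply.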

Author information

Authors and Affiliations

Center for Data Science, Peking University, Beijing, China
Xiao Su
National Engineering Research Center of Software Engineering, Peking University, Beijing, China
Ying Li & Zhonghai Wu


Corresponding author

Correspondence to Ying Li.

Editor information

Editors and Affiliations

IIIT Hyderabad, Hyderabad, India
Kamal Karlapalem
Chinese University of Hong Kong, Shatin, Hong Kong
Hong Cheng
Virginia Tech, Arlington, VA, USA
Naren Ramakrishnan
Jawaharlal Nehru University, New Delhi, India
R. K. Agrawal
IIIT Hyderabad, Hyderabad, India
P. Krishna Reddy
University of Minnesota, Minneapolis, MN, USA
Jaideep Srivastava
IIIT Delhi, New Delhi, India
Tanmoy Chakraborty

About this paper

Cite this paper

Su, X., Li, Y., Wu, Z. (2021). Fusing Essential Knowledge for Text-Based Open-Domain Question Answering. In: Karlapalem, K., et al. Advances in Knowledge Discovery and Data Mining. PAKDD 2021. Lecture Notes in Computer Science, vol 12713. Springer, Cham. https://doi.org/10.1007/978-3-030-75765-6_50

DOI: https://doi.org/10.1007/978-3-030-75765-6_50
Published: 08 May 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-75764-9
Online ISBN: 978-3-030-75765-6
eBook Packages: Computer Science, Computer Science (R0)
