research-article

Open access

Benchmark and Neural Architecture for Conversational Entity Retrieval from a Knowledge Graph

Authors:

Fedor Nikolaev,

Alexander KotovAuthors Info & Claims

WWW '24: Proceedings of the ACM Web Conference 2024

Pages 1519 - 1528

https://doi.org/10.1145/3589334.3645676

Published: 13 May 2024 Publication History

Abstract

This paper introduces a novel information retrieval (IR) task of Conversational Entity Retrieval from a Knowledge Graph (CER-KG), which extends non-conversational entity retrieval from a knowledge graph (KG) to the conversational scenario. The user queries in CER-KG dialog turns may rely on the results of the preceding turns, which are KG entities. Similar to the conversational document IR, CER-KG can be viewed as a sequence of interrelated ranking tasks. To enable future research on CER-KG, we created QBLink-KG, a publicly available benchmark that was adapted from QBLink, a benchmark for text-based conversational reading comprehension of Wikipedia. As an initial approach to CER-KG, we experimented with Transformer- and LSTM-based query encoders in combination with the Neural Architecture for Conversational Entity Retrieval (NACER), our proposed feature-based neural architecture for entity ranking in CER-KG. NACER computes the ranking score of a candidate KG entity by taking into account diverse lexical and semantic matching signals between various KG components in its neighborhood, such as entities, categories, and literals, as well as entities in the results of the preceding turns in dialog history. The reported experimental results reveal the key challenges of CER-KG along with the possible directions for new approaches to this task.

Supplemental Material

MP4 File

Supplemental video

Download
7.36 MB

References

[1]

Sanjeev Arora, Yingyu Liang, and Tengyu Ma. 2016. A simple but tough-to-beat baseline for sentence embeddings. In Proceedings of the 4th International Conference on Learning Representations (ICLR).

[2]

Saeid Balaneshinkordan, Alexander Kotov, and Fedor Nikolaev. 2018. Attentive Neural Architecture for Ad-hoc Structured Document Retrieval. In Proceedings of the 27th ACM International on Conference on Information and Knowledge Management (CIKM). 1173--1182.

Digital Library

[3]

Antoine Bordes, Nicolas Usunier, Sumit Chopra, and Jason Weston. 2015. Large-scale simple question answering with memory networks. arXiv preprint arXiv:1506.02075.

[4]

Denny Britz, Anna Goldie, Minh-Thang Luong, and Quoc Le. 2017. Massive Exploration of Neural Machine Translation Architectures. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP). 1442--1451.

[5]

Shulin Cao, Jiaxin Shi, Liangming Pan, Lunyiu Nie, Yutong Xiang, Lei Hou, Juanzi Li, Bin He, and Hanwang Zhang. 2022. KQA Pro: A Dataset with Explicit Compositional Programs for Complex Question Answering over Knowledge Base. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL). 6101--6119.

[6]

Shubham Chatterjee and Laura Dietz. 2022. BERT-ER: Query-specific BERT Entity Representations for Entity Ranking. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). 1466--1477.

Digital Library

[7]

Jing Chen, Chenyan Xiong, and Jamie Callan. 2016. An empirical study of learning to rank for entity search. In Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval (SIGIR). 737--740.

Digital Library

[8]

Philipp Christmann, Rishiraj Saha Roy, Abdalghani Abujabal, Jyotsna Singh, and Gerhard Weikum. 2019. Look before you hop: Conversational question answering over knowledge graphs using judicious context expansion. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management (CIKM). 729--738.

Digital Library

[9]

Philipp Christmann, Rishiraj Saha Roy, and Gerhard Weikum. 2022. Conversational question answering on heterogeneous sources. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). 144--154.

Digital Library

[10]

Alexis Conneau, Douwe Kiela, Holger Schwenk, Lo"i c Barrault, and Antoine Bordes. 2017. Supervised Learning of Universal Sentence Representations from Natural Language Inference Data. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP). 670--680.

[11]

Jeffrey Dalton, Sophie Fischer, Paul Owoicho, Filip Radlinski, Federico Rossetto, Johanne R Trippas, and Hamed Zamani. 2022. Conversational Information Seeking: Theory and Application. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). 3455--3458.

Digital Library

[12]

Jeffrey Dalton, Chenyan Xiong, Vaibhav Kumar, and Jamie Callan. 2020. CAsT-19: A dataset for conversational information seeking. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). 1985--1988.

Digital Library

[13]

Nicola De Cao, Gautier Izacard, Sebastian Riedel, and Fabio Petroni. 2021. Autoregressive entity retrieval. Proceedings of the 9th International Conference on Learning Representations (ICLR).

[14]

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT). 4171--4186.

[15]

Ahmed Elgohary, Chen Zhao, and Jordan Boyd-Graber. 2018. Dataset and baselines for sequential open-domain question answering. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP). 1077--1083.

[16]

Emma J Gerritse, Faegheh Hasibi, and Arjen P de Vries. 2022. Entity-aware Transformers for Entity Search. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). 1455--1465.

Digital Library

[17]

Daya Guo, Duyu Tang, Nan Duan, Ming Zhou, and Jian Yin. 2018. Dialog-to-action: Conversational question answering over a large-scale knowledge base. In Proceedings of the 32nd Annual Conference on Neural Information Processing Systems (NeurIPS). 2942--2951.

[18]

Nam Hai Le, Thomas Gerald, Thibault Formal, Jian-Yun Nie, Benjamin Piwowarski, and Laure Soulier. 2023. CoSPLADE: Contextualizing SPLADE for Conversational Information Retrieval. In Proceedings of the 45th European Conference on Information Retrieval (ECIR). 537--552.

[19]

Faegheh Hasibi, Fedor Nikolaev, Chenyan Xiong, Krisztian Balog, Svein Erik Bratsberg, Alexander Kotov, and Jamie Callan. 2017. DBpedia-entity v2: a test collection for entity search. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). 1265--1268.

Digital Library

[20]

Xiaodong He and David Golub. 2016. Character-level question answering with attention. In Proceedings of the 2016 conference on empirical methods in natural language processing (EMNLP). 1598--1607.

[21]

Xin Huang, Jung-Jae Kim, and Bowei Zou. 2021. Unseen Entity Handling in Complex Question Answering over Knowledge Base via Language Generation. In Findings of the Association for Computational Linguistics: EMNLP. 547--557.

[22]

Mohit Iyyer, Wen-tau Yih, and Ming-Wei Chang. 2017. Search-based neural structured learning for sequential question answering. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL). 1821--1831.

[23]

Endri Kacupaj, Joan Plepi, Kuldeep Singh, Harsh Thakkar, Jens Lehmann, and Maria Maleshkova. 2021. Conversational Question Answering over Knowledge Graphs with Transformer and Graph Attention Networks. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics (EACL). 850--862.

[24]

Magdalena Kaiser, Rishiraj Saha Roy, and Gerhard Weikum. 2021. Reinforcement learning from reformulations in conversational question answering over knowledge graphs. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). 459--469.

Digital Library

[25]

Gangwoo Kim, Hyunjae Kim, Jungsoo Park, and Jaewoo Kang. 2021. Learn to resolve conversational dependency: A consistency training framework for conversational question answering. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP). 6130--6141.

[26]

Diederik P. Kingma and Jimmy Ba. 2015. Adam: A Method for Stochastic Optimization. In Proceesings of the 3rd International Conference on Learning Representations (ICLR).

[27]

Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. 2012. Imagenet classification with deep convolutional neural networks. Proceedings of the 26th Annual Conference on Neural Information Processing Systems (NeurIPS), Vol. 25, 1097--1105.

[28]

Jens Lehmann, Robert Isele, Max Jakob, Anja Jentzsch, Dimitris Kontokostas, Pablo N Mendes, Sebastian Hellmann, Mohamed Morsey, Patrick Van Kleef, Sören Auer, et al. 2015. DBpedia--a large-scale, multilingual knowledge base extracted from Wikipedia. Semantic Web, Vol. 6, 2, 167--195.

[29]

Sheng-Chieh Lin, Jheng-Hong Yang, and Jimmy Lin. 2021. Contextualized Query Embeddings for Conversational Search. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP). 1004--1015.

[30]

Xi Victoria Lin, Richard Socher, and Caiming Xiong. 2018. Multi-Hop Knowledge Graph Reasoning with Reward Shaping. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP). 3243--3253.

[31]

Denis Lukovnikov, Asja Fischer, and Jens Lehmann. 2019. Pretrained transformers for simple question answering over knowledge graphs. In Proceedings of the 2019 International Semantic Web Conference (ISWC). 470--486.

Digital Library

[32]

Denis Lukovnikov, Asja Fischer, Jens Lehmann, and Sören Auer. 2017. Neural network-based question answering over knowledge graphs on word and character level. In Proceedings of the 26th international conference on World Wide Web (WWW). 1211--1220.

Digital Library

[33]

Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze. 2008. Introduction to Information Retrieval. Cambridge University Press.

[34]

Pierre Marion, Pawel Nowak, and Francesco Piccinno. 2021. Structured Context and High-Coverage Grammar for Conversational Question Answering over Knowledge Graphs. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP). 8813--8829.

[35]

Alexander H. Miller, Adam Fisch, Jesse Dodge, Amir-Hossein Karimi, Antoine Bordes, and Jason Weston. 2016. Key-Value Memory Networks for Directly Reading Documents. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, (EMNLP). 1400--1409.

[36]

Salman Mohammed, Peng Shi, and Jimmy Lin. 2018. Strong Baselines for Simple Question Answering over Knowledge Graphs with and without Neural Networks. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT). 291--296.

[37]

Fedor Nikolaev and Alexander Kotov. 2020. Joint Word and Entity Embeddings for Entity Retrieval from a Knowledge Graph. In Proceedings of the 42nd European Conference on Information Retrieval (ECIR). 141--155.

Digital Library

[38]

Fedor Nikolaev, Alexander Kotov, and Nikita Zhiltsov. 2016. Parameterized fielded term dependence models for ad-hoc entity retrieval from knowledge graph. In Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval (SIGIR). 435--444.

Digital Library

[39]

Michael Petrochuk and Luke Zettlemoyer. 2018. SimpleQuestions Nearly Solved: A New Upperbound and Baseline Approach. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP). 554--558.

[40]

Kechen Qin, Cheng Li, Virgil Pavlu, and Javed Aslam. 2021. Improving Query Graph Generation for Complex Question Answering over Knowledge Base. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP). 4201--4207.

[41]

Minghui Qiu, Xinjing Huang, Cen Chen, Feng Ji, Chen Qu, Wei Wei, Jun Huang, and Yin Zhang. 2021. Reinforced history backtracking for conversational question answering. In Proceedings of the 35th AAAI Conference on Artificial Intelligence (AAAI). 13718--13726.

[42]

Chen Qu, Liu Yang, Cen Chen, Minghui Qiu, W Bruce Croft, and Mohit Iyyer. 2020. Open-retrieval conversational question answering. In Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval (SIGIR). 539--548.

Digital Library

[43]

Chen Qu, Liu Yang, Minghui Qiu, W Bruce Croft, Yongfeng Zhang, and Mohit Iyyer. 2019a. BERT with history answer embedding for conversational question answering. In Proceedings of the 42nd international ACM SIGIR conference on research and development in information retrieval (SIGIR). 1133--1136.

Digital Library

[44]

Chen Qu, Liu Yang, Minghui Qiu, Yongfeng Zhang, Cen Chen, W Bruce Croft, and Mohit Iyyer. 2019b. Attentive history selection for conversational question answering. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management (CIKM). 1391--1400.

Digital Library

[45]

Stephen Robertson, Hugo Zaragoza, et al. 2009. The probabilistic relevance framework: BM25 and beyond. Foundations and Trends® in Information Retrieval, Vol. 3, 4 (2009), 333--389.

Digital Library

[46]

Amrita Saha, Vardaan Pahuja, Mitesh M Khapra, Karthik Sankaranarayanan, and Sarath Chandar. 2018. Complex sequential question answering: Towards learning to converse over linked question answer pairs with a knowledge graph. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI). 705--713.

[47]

Apoorv Saxena, Aditay Tripathi, and Partha Talukdar. 2020. Improving multi-hop question answering over knowledge graphs using knowledge base embeddings. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL). 4498--4507.

[48]

Tao Shen, Xiubo Geng, Tao Qin, Daya Guo, Duyu Tang, Nan Duan, Guodong Long, and Daxin Jiang. 2019. Multi-Task Learning for Conversational Question Answering over a Large-Scale Knowledge Base. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2442--2451.

[49]

Haitian Sun, Tania Bedrax-Weiss, and William Cohen. 2019. PullNet: Open Domain Question Answering with Iterative Retrieval on Knowledge Bases and Text. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2380--2390.

[50]

Haitian Sun, Bhuwan Dhingra, Manzil Zaheer, Kathryn Mazaitis, Ruslan Salakhutdinov, and William Cohen. 2018. Open Domain Question Answering Using Early Fusion of Knowledge Bases and Text. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP). 4231--4242.

[51]

Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, et al. 2023. Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023).

[52]

Ferhan Ture and Oliver Jojic. 2017. No Need to Pay Attention: Simple Recurrent Neural Networks Work!. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2866--2872.

[53]

Svitlana Vakulenko, Shayne Longpre, Zhucheng Tu, and Raviteja Anantha. 2021. Question rewriting for conversational question answering. In Proceedings of the 14th ACM International Conference on Web search and Data Mining (WSDM). 355--363.

Digital Library

[54]

Nikos Voskarides, Li Dan, Ren Pengjie, Kanoulas Evangelos, and Maarten de Rijke. 2020. Query resolution for conversational search with limited supervision. In In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). 921--930.

Digital Library

[55]

Ruize Wang, Duyu Tang, Nan Duan, Zhongyu Wei, Xuanjing Huang, Guihong Cao, Daxin Jiang, Ming Zhou, et al. 2021. K-adapter: Infusing knowledge into pre-trained models with adapters. In Findings of the Association for Computational Linguistics: ACL-IJCNLP. 1405--1418.

[56]

Jason Weston, Sumit Chopra, and Antoine Bordes. 2015. Memory Networks. In Proceedings of the 3rd International Conference on Learning Representations, (ICLR).

[57]

Ledell Wu, Fabio Petroni, Martin Josifoski, Sebastian Riedel, and Luke Zettlemoyer. 2020. Scalable Zero-shot Entity Linking with Dense Entity Retrieval. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). 6397--6407.

[58]

Wenpeng Yin, Mo Yu, Bing Xiang, Bowen Zhou, and Hinrich Schü tze. 2016. Simple Question Answering by Attentive Convolutional Neural Network. In Proceedings of the 2016 International Conference on Computational Linguistics (COLING). 1746--1756.

[59]

Nikita Zhiltsov, Alexander Kotov, and Fedor Nikolaev. 2015. Fielded sequential dependence model for ad-hoc entity retrieval in the web of data. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). 253--262. io

Digital Library

Index Terms

Benchmark and Neural Architecture for Conversational Entity Retrieval from a Knowledge Graph
1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking

Recommendations

Learning contextual representations for entity retrieval
Abstract
In this paper, we introduce Contextual Entity Ranking (CoER) for the task of entity retrieval. CoER utilizes a textual knowledge graph to learn entities’ representations that are contextualized based on a given query. With these contextual ...
Utilizing Knowledge Graphs for Text-Centric Information Retrieval
SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval

The past decade has witnessed the emergence of several publicly available and proprietary knowledge graphs (KGs). The depth and breadth of content in these KGs made them not only rich sources of structured knowledge by themselves, but also valuable ...
Entity linking and retrieval
SIGIR '13: Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval

This full-day tutorial presents a comprehensive introduction to entity linking and retrieval. Part I provides a detailed overview of entity linking: identifying and disambiguating entity occurrences in unstructured text. Part II focuses on entity ...

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences

WWW '24: Proceedings of the ACM Web Conference 2024

May 2024

4826 pages

ISBN:9798400701719

DOI:10.1145/3589334

General Chairs:
Tat-Seng Chua
National University of Singapore
,
Chong-Wah Ngo
Singapore Management University
,
Proceedings Chair:
Roy Ka-Wei Lee
Singapore University of Technology and Design
,
Program Chairs:
Ravi Kumar
Google
,
Hady W. Lauw
Singapore Management University

Copyright © 2024 Owner/Author.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike International 4.0 License.

Sponsors

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 13 May 2024

Check for updates

Badges

Artifacts Available / v1.1

Author Tags

Qualifiers

Research-article

Funding Sources

Conference

WWW '24

Sponsor:

SIGWEB

WWW '24: The ACM Web Conference 2024

May 13 - 17, 2024

Singapore, Singapore

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

0
Total Citations
437
Total Downloads

Downloads (Last 12 months)437
Downloads (Last 6 weeks)65

Reflects downloads up to 05 Mar 2025

Other Metrics

View Author Metrics

Citations

View Options

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Figures

Tables

Media

View Table of Conten