abstract

Conversational Bibliographic Search

Author:

SIGIR '23: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval

Page 3491

https://doi.org/10.1145/3539618.3591789

Published: 18 July 2023 Publication History

Get Access

Abstract

In almost every area of research, it is necessary to find experts and publications on a topic. However, finding experts and publications is a difficult task not only for computers, but also for humans. For example, searching for experts, a user often enters a topic into a search engine, which then checks which people have published on that topic. A problem arises when a user does not make their query specific enough which can happen intentionally, e.g. when the user is doing a navigational search, or unintentionally, e.g., when the user lacks knowledge. As a result, the quality of the search results may not be very high and the best results may not be found. Current and widely used search engines for bibliographic metadata, such as dblp[2], ResearchGate, Google Scholar or Semantic Scholar allow only keyword-based searches. Kreutz et al.[1] presented SchenQL, a query language for bibliographic metadata that allows users to formulate their queries more easily and precisely than SQL. However, it requires training to understand the language and is not as easy for non-experts to use e.g. Google Scholar.

To address the limitations of insufficient attention to the user's search intent and lack of search support, we aim to develop a conversational retrieval system in the domain of bibliographic metadata. This conversational search system assists users in achieving their search intent through a natural language dialog. It should be possible not only to find experts, but also to search for bibliographic metadata with the help of the system without prior knowledge.

In this work, we aim to answer the research question: How beneficial is a conversational information retrieval system for searching bibliographic data? To address the research question, our contribution is threefold. First, we present an architecture for such a conversational information retrieval system for bibliographic metadata. Second, we will implement all the components of this system and evaluate our system by comparing it to existing bibliographic data search engines in terms of effectiveness, efficiency, and user satisfaction. Third, we will create and publish a dataset consisting of user queries that we will use to train our system.

The architecture we propose consists of five main components: i) user intent classification, ii) a keyword extractor, iii) a search module, iv) a conversational module and v) the conversation history.

i) The task of user intent classification is to determine the goal the user wants to achieve with their search query. The user intent classification consists of a set of corresponding user intent classifiers, each of which is responsible for one intent. If no classifier can match the user's query to their intent, or multiple classifiers conclude that the query matches their intents, the system asks the user to specify the query accordingly. ii) If the intent is correctly determined, a keyword extractor extracts the actual search term from the query. iii) After the intent and the search term have been determined, the actual search takes place in the search module. The user's query could be reformulated into a SchenQL query (Kreutz et al.[1] and sent to the database. iv) In the conversational module, the results are converted into a natural language response and the user is given suggestions for further queries related to the previous ones. v) The conversation history stores the user's queries and the system's responses, both to consider the entire session when determining intent and to improve the system's components. For example, new question formulations could improve the accuracy of the user intent classifiers as well as reveal what new intents a user of such a conversational search system might have that have not yet been implemented.

In first experiments, we defined four user intents and already evaluated the user intent classification. The four user intents are: (1) searching for persons/authors/experts on a topic, (2) searching for publications by author name, (3) searching for publications on a topic and (4) searching for similar topics of a topic. Our classifiers achieved an accuracy of 0.998 in correctly determining user intent.

In the future, we not only want to evaluate the individual components of a conversational information retrieval system for bibliographic data, but also want to work out the advantages and disadvantages of the conversational information retrieval system for bibliographic data, in comparison to already existing systems that do not support the user in their search process via natural language conversations.

Supplemental Material

MP4 File

Presentation Video

Download
113.66 MB

References

[1]

Christin Katharina Kreutz, MichaelWolz, Jascha Knack, BenjaminWeyers, and Ralf Schenkel. 2022. SchenQL: in-depth analysis of a query language for bibliographic metadata. Int. J. Digit. Libr. 23, 2 (2022), 113--132. https://doi.org/10.1007/s00799-021-00317-8

Digital Library

Google Scholar

[2]

Michael Ley. 2009. DBLP - Some Lessons Learned. Proc. VLDB Endow. 2, 2 (2009), 1493--1500. https://doi.org/10.14778/1687553.1687577

Digital Library

Google Scholar

Index Terms

Conversational Bibliographic Search
1. Information systems
  1. Information retrieval
    1. Information retrieval query processing
  2. Information systems applications
    1. Digital libraries and archives

Recommendations

User Intent Inference for Web Search and Conversational Agents
WSDM '20: Proceedings of the 13th International Conference on Web Search and Data Mining

User intent understanding is a crucial step in designing both conversational agents and search engines. Detecting or inferring user intent is challenging, since the user utterances or queries can be short, ambiguous, and contextually dependent. To ...
Concordance-based entity-oriented search

We consider the problem of finding relevant named entities in response to a search query over a given text corpus. Entity search can readily be used to augment conventional web search engines for a variety of applications. We use entity concordance ...
A Prototype Process-Based Search Engine
ICSC '09: Proceedings of the 2009 IEEE International Conference on Semantic Computing

This paper introduces a novel approach to solving a certain type of Web search problem. The concept of Process-Based Web search is proposed and a prototype Process-Based Search Engine has been developed and evaluated to help users solve problems more ...

Comments

Information & Contributors

Information

Published In

SIGIR '23: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval

July 2023

3567 pages

ISBN:9781450394086

DOI:10.1145/3539618

General Chairs:
Hsin-Hsi Chen
National Taiwan University
,
Wei-Jou (Edward) Duh
National Taiwan University
,
Hen-Hsen Huang
Academia Sinica
,
Program Chairs:
Makoto P. Kato
Spotify
,
Josiane Mothe
Universite de Toulouse
,
Barbara Poblete
University of Chile and Amazon Visiting Academic

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 18 July 2023

Check for updates

Author Tags

Qualifiers

Abstract

Conference

SIGIR '23

Sponsor:

SIGIR

SIGIR '23: The 46th International ACM SIGIR Conference on Research and Development in Information Retrieval

July 23 - 27, 2023

Taipei, Taiwan

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

0
Total Citations
65
Total Downloads

Downloads (Last 12 months)26
Downloads (Last 6 weeks)3

Reflects downloads up to 05 Mar 2025

Other Metrics

View Author Metrics

Citations

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Abstract

Supplemental Material

References

Index Terms

Recommendations

User Intent Inference for Web Search and Conversational Agents

Concordance-based entity-oriented search

A Prototype Process-Based Search Engine

Comments

Information

Published In

Sponsors

Publisher

Publication History

Check for updates

Author Tags

Qualifiers

Conference

Acceptance Rates

Contributors

Other Metrics

Bibliometrics

Article Metrics

Other Metrics

Citations

Login options

Full Access

View options

PDF

eReader

Share

Share this Publication link

Share on social media

Affiliations