Abstract
In almost every area of research, it is necessary to find experts and publications on a topic. However, finding experts and publications is a difficult task not only for computers, but also for humans. For example, when searching for experts, a user often enters a topic into a search engine, which then checks which people have published on that topic. A problem arises when a user does not make their query specific enough, which can happen intentionally, e.g., when the user is doing a navigational search, or unintentionally, e.g., when the user lacks knowledge. As a result, the quality of the search results may be low and the best results may not be found. Current and widely used search engines for bibliographic metadata, such as dblp[2], ResearchGate, Google Scholar, or Semantic Scholar, allow only keyword-based searches. Kreutz et al.[1] presented SchenQL, a query language for bibliographic metadata that allows users to formulate their queries more easily and precisely than with SQL. However, it requires training to understand the language and is not as easy for non-experts to use as, e.g., Google Scholar.
To address the limitations of insufficient attention to the user's search intent and lack of search support, we aim to develop a conversational retrieval system in the domain of bibliographic metadata. This conversational search system assists users in achieving their search intent through a natural language dialog. It should be possible not only to find experts, but also to search for bibliographic metadata with the help of the system without prior knowledge.
In this work, we aim to answer the research question: How beneficial is a conversational information retrieval system for searching bibliographic data? To address the research question, our contribution is threefold. First, we present an architecture for such a conversational information retrieval system for bibliographic metadata. Second, we will implement all the components of this system and evaluate our system by comparing it to existing bibliographic data search engines in terms of effectiveness, efficiency, and user satisfaction. Third, we will create and publish a dataset consisting of user queries that we will use to train our system.
The architecture we propose consists of five main components: i) user intent classification, ii) a keyword extractor, iii) a search module, iv) a conversational module and v) the conversation history.
i) The task of user intent classification is to determine the goal the user wants to achieve with their search query. The user intent classification consists of a set of user intent classifiers, each of which is responsible for one intent. If no classifier can match the user's query to its intent, or if multiple classifiers conclude that the query matches their intents, the system asks the user to make the query more specific. ii) Once the intent has been determined, a keyword extractor extracts the actual search term from the query. iii) After the intent and the search term have been determined, the actual search takes place in the search module. The user's query could be reformulated into a SchenQL query (Kreutz et al.[1]) and sent to the database. iv) In the conversational module, the results are converted into a natural language response and the user is given suggestions for further queries related to the previous ones. v) The conversation history stores the user's queries and the system's responses, both to consider the entire session when determining intent and to improve the system's components. For example, new question formulations could improve the accuracy of the user intent classifiers and reveal intents that users of such a conversational search system have but that are not yet implemented.
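The intent resolution step in i) can be illustrated with a minimal sketch. It assumes one binary classifier per intent and falls back to a clarification request when zero or several classifiers accept the query. The names `Resolution`, `resolve_intent`, and `TOY_CLASSIFIERS`, the intent labels, and the clarification wording are hypothetical, and the keyword rules are only stand-ins for trained classifiers, which are not specified here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Resolution:
    """Outcome of intent resolution: either an intent or a clarification request."""
    intent: Optional[str]
    clarification: Optional[str]


def resolve_intent(query: str, classifiers: Dict[str, Callable[[str], bool]]) -> Resolution:
    """Ask every per-intent classifier whether it accepts the query."""
    matches = [name for name, accepts in classifiers.items() if accepts(query)]
    if len(matches) == 1:
        # Exactly one classifier accepted the query: the intent is unambiguous.
        return Resolution(intent=matches[0], clarification=None)
    if not matches:
        # No classifier accepted the query: ask the user to rephrase.
        return Resolution(None, "I could not tell what you are looking for. Could you rephrase?")
    # Several classifiers accepted the query: ask the user to choose.
    options = ", ".join(matches)
    return Resolution(None, f"Your request could mean: {options}. Which one did you mean?")


# Toy keyword rules standing in for trained classifiers (assumption, for illustration only).
TOY_CLASSIFIERS: Dict[str, Callable[[str], bool]] = {
    "expert_search": lambda q: "expert" in q.lower(),
    "publications_by_author": lambda q: "papers by" in q.lower(),
    "publications_on_topic": lambda q: "papers on" in q.lower(),
}

if __name__ == "__main__":
    for q in ["Who are experts on information retrieval?",
              "experts and papers on databases",
              "hello"]:
        print(q, "->", resolve_intent(q, TOY_CLASSIFIERS))
```

Keeping one classifier per intent means a new intent can be added as a new dictionary entry without retraining the others, at the cost of having to handle overlapping decisions explicitly, as the clarification branch does.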
In initial experiments, we defined four user intents and evaluated the user intent classification. The four user intents are: (1) searching for persons/authors/experts on a topic, (2) searching for publications by author name, (3) searching for publications on a topic, and (4) searching for topics similar to a given topic. Our classifiers achieved an accuracy of 0.998 in correctly determining user intent.
In the future, we want not only to evaluate the individual components of a conversational information retrieval system for bibliographic data, but also to work out its advantages and disadvantages in comparison to existing systems that do not support users in their search process via natural language conversations.