Journal of Web Semantics

Volume 19, March 2013, Pages 1-21

Improving habitability of natural language interfaces for querying ontologies with feedback and clarification dialogues

https://doi.org/10.1016/j.websem.2013.02.002

Abstract

Natural Language Interfaces (NLIs) are a viable, human-readable alternative to complex, formal query languages like SPARQL, which are typically used for accessing semantically structured data (e.g. RDF and OWL repositories). However, in order to cope with natural language ambiguities, NLIs typically support a more restricted language. A major challenge when designing such restricted languages is habitability: how easily, naturally and effectively users can use the language to express themselves within the constraints imposed by the system. In this paper, we investigate two methods for improving the habitability of a Natural Language Interface: feedback and clarification dialogues. We model feedback by showing the user how the system interprets the query, thus suggesting repair through query reformulation. Next, we investigate how clarification dialogues can be used to control the query interpretations generated by the system. To reduce the cognitive overhead, clarification dialogues are coupled with a learning mechanism. Both methods are shown to have a positive effect on the overall performance and habitability.

Introduction

Recent years have seen a tremendous increase in structured data on the Web, with public-sector bodies such as the UK and US governments opening their data to the public and encouraging others to build useful applications on top. At the same time, the Linked Open Data (LOD) project continues to promote the authoring, publication and interlinking of new RDF graphs with those already in the LOD cloud [1]. In March 2009, around 4 billion RDF statements were available; by September 2010 this number had increased to 25 billion, and it continues to grow. Exploiting this massive amount of data effectively is now a great challenge, largely due to the complexity and syntactic unfamiliarity of the underlying triple models and the query languages built on top of them. Natural Language Interfaces (NLIs) to rich, structured data, such as RDF and OWL repositories, are a viable, human-readable alternative.

The main challenges in building NLIs centre around solving the Natural Language understanding problem; the data that is being queried; and the user, together with the way in which the user’s information need is verbalised into a question.

Solving the Natural Language understanding problem includes grammar analysis, resolving language ambiguity, and handling expressiveness, e.g. [2]. Ambiguity can be avoided through the use of a Controlled Natural Language (CNL): a subset of Natural Language (NL) with a limited vocabulary and grammar rules that must be followed. Expressiveness can be improved by extending the system vocabulary with external resources such as WordNet [3] or FrameNet [4].
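As a purely illustrative sketch (the matching strategy and the ontology labels below are our assumptions, not the mechanism of any cited system), extending the vocabulary with WordNet can amount to expanding a query term with the synonyms of its senses and matching the expanded set against the labels the interface knows:

    # Sketch: expanding a query term with WordNet synonyms before matching it
    # against ontology labels (illustrative only; requires the NLTK WordNet data).
    from nltk.corpus import wordnet as wn

    def expand_term(term):
        """Return the term together with WordNet synonyms of all its senses."""
        synonyms = {term.lower()}
        for synset in wn.synsets(term):
            for lemma in synset.lemmas():
                synonyms.add(lemma.name().replace("_", " ").lower())
        return synonyms

    # Hypothetical ontology labels the interface knows about.
    ontology_labels = {"author", "composer", "song", "album"}

    print(expand_term("writer") & ontology_labels)  # e.g. {'author'}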

The second group of challenges is related to the data that is being queried, and building portable systems—those that can be easily ported from one domain or ontology to another without significant effort. According to [5], a major challenge when building NLIs is to provide the information the system needs to bridge the gap between the way the user thinks about the domain of discourse and the way the domain knowledge is structured for computer processing. This implies that in the context of NLIs to ontologies, it is very important to consider the ontology structure and content. Two ontologies describing identical domains (e.g., music) can use different modelling conventions. For example, while one ontology can use a datatype property artistName of class Artist, the other might use instances of a special class to model the artist’s name.
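For illustration only (the vocabulary ex:artistName, ex:hasName, ex:Name and ex:value is invented), the following sketch expresses the two modelling conventions as small RDF graphs and shows that retrieving the same information requires a different SPARQL pattern in each case:

    # Sketch: the same fact modelled under two conventions, each needing its own
    # SPARQL pattern (rdflib; all URIs invented for illustration).
    from rdflib import Graph

    # Convention 1: a datatype property artistName on class Artist.
    ttl_datatype = """
    @prefix ex: <http://example.org/> .
    ex:a1 a ex:Artist ; ex:artistName "Madonna" .
    """

    # Convention 2: the name is modelled as an instance of a dedicated Name class.
    ttl_instance = """
    @prefix ex: <http://example.org/> .
    ex:a1 a ex:Artist ; ex:hasName ex:n1 .
    ex:n1 a ex:Name ; ex:value "Madonna" .
    """

    q_datatype = """PREFIX ex: <http://example.org/>
    SELECT ?name WHERE { ?a a ex:Artist ; ex:artistName ?name }"""

    q_instance = """PREFIX ex: <http://example.org/>
    SELECT ?name WHERE { ?a a ex:Artist ; ex:hasName ?n . ?n ex:value ?name }"""

    for ttl, query in [(ttl_datatype, q_datatype), (ttl_instance, q_instance)]:
        g = Graph()
        g.parse(data=ttl, format="turtle")
        print([str(name) for (name,) in g.query(query)])  # ['Madonna'] under both conventions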

Ontologies can be constructed to include sufficient lexical information to support a domain-independent query analysis engine. However, due to the different processes used to generate ontologies, the extracted domain lexicon might be of varying quality. In addition, some words might have different meanings in two different domains. For example, “How big” might refer to height, but also to length, area, or population—depending on the question context, but also on the ontology structure. Such adjustments, i.e. mappings from words or phrases to ontology concepts and relations, are performed during the customisation of NLIs.
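In the simplest case, such a mapping might look like a per-domain lexicon consulted during query interpretation (the property names and domains below are invented, and real customisation is typically richer than a static lookup table):

    # Sketch: a per-domain lexicon mapping an ambiguous phrase to candidate
    # ontology properties (all property names invented for illustration).
    DOMAIN_LEXICON = {
        "geography": {
            "how big": ["geo:area", "geo:population", "geo:length"],
            "how high": ["geo:elevation"],
        },
        "music": {
            "how big": ["mus:salesCount"],
        },
    }

    def candidate_properties(phrase, domain):
        """Return the ontology properties a phrase may map to in a given domain."""
        return DOMAIN_LEXICON.get(domain, {}).get(phrase.lower(), [])

    print(candidate_properties("How big", "geography"))
    # ['geo:area', 'geo:population', 'geo:length']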

The third group of challenges is centred around the users and how they translate their information need into questions. While NLIs are intuitive, having only one text query box can pose difficulties for users, who need to express their information need effectively through a natural language query [6]. In order to address this problem, several usability enhancement methods have been developed with the aim of either assisting users with query formulation or communicating the system’s interpretation of the query to the user. In other words, the role of these methods is to increase the habitability of the system. Habitability refers to how easily, naturally and effectively users can use language to express themselves within the constraints imposed by the system. If users can express everything they need for their tasks using the constrained system language, then such a language is considered habitable.

Our focus is on building portable systems that do not require strict adherence to syntax—the supported language includes not only grammatically correct and ill-formed questions, but also question fragments. We look at improving the habitability of such NLIs to ontologies through the application of feedback and clarification dialogues. We first discuss habitability and the four different domains that it covers in Section 2. We then describe how we model feedback relative to the specific habitability domains, and evaluate it in a user-centric, task-based evaluation (Section 3). Further on, in Section 4 we look at clarification dialogues and whether they can improve the specific habitability domains by making the process of mapping an NL question into a formal query transparent to the user. We combine the dialogue with a light learning model in order to reduce the user’s cognitive overhead and improve the system’s performance over time. We then examine this approach, which combines clarification dialogues with learning, in a controlled evaluation using the Mooney GeoQuery dataset.

Section snippets

Habitability

According to Epstein [7], a language is habitable if:

  • Users are able to construct expressions of the language which they have not previously encountered, without significant conscious effort.

  • Users are able to easily avoid constructing expressions that are not part of the language.

Another way of viewing habitability is as the mismatch between user expectations and the capabilities of an NLI system [8]. Ogden and Bernick [9] describe habitability in the context of four domains:

  • The conceptual

Feedback

Showing the user the system’s interpretation of the query in a suitably understandable format is called feedback. Feedback increases the user’s confidence and, in the case of failures, helps the user understand which habitability domain is affected. Several early studies [11], [12] show that, after receiving feedback, users become more familiar with the system’s interpretations and usually go on to imitate the system’s feedback language. In other words, returning
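A minimal sketch of this idea (not the feedback component evaluated in this paper) is to verbalise the triple patterns of the generated interpretation before the query is executed, so that the user can check, and if necessary reformulate, the question:

    # Sketch: verbalising a candidate interpretation (a list of triple patterns)
    # so the user sees how the system understood the question (illustrative only).
    def verbalise(triples):
        """Render (subject, property, object) patterns as a readable feedback string."""
        clauses = ["%s -- %s --> %s" % spo for spo in triples]
        return "I understood your question as: " + "; ".join(clauses)

    # Hypothetical interpretation of "How big is California?" over a geography ontology.
    interpretation = [("?state", "hasName", '"California"'),
                      ("?state", "hasArea", "?answer")]
    print(verbalise(interpretation))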

Clarification dialogues

Using clarification dialogues is a common way of solving the ambiguity problem in NLIs to ontologies (e.g., Querix [18], AquaLog [19]), and involves engaging the user in a dialogue whenever the system fails to resolve the ambiguities automatically. This method is especially effective for large knowledge bases containing a huge number of items with identical names, but also when the question itself is ambiguous. For example, if the user asks “How big is California?”, the system might discover ambiguity when
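The sketch below illustrates the general shape of such a dialogue, coupled with a minimal memory of earlier choices so that a clarified mapping is not asked for again; the candidate properties and the one-shot learning scheme are invented for illustration and are far simpler than the learning model evaluated in this paper:

    # Sketch: a clarification dialogue with a minimal learning step -- once the
    # user has disambiguated a phrase, the choice is remembered and reused
    # (candidate mappings invented; much simpler than the model in the paper).
    learned = {}  # (phrase, domain) -> chosen ontology property

    def clarify(phrase, candidates, domain):
        key = (phrase, domain)
        if key in learned:                        # reuse an earlier clarification
            return learned[key]
        print('For "%s", which property did you mean?' % phrase)
        for i, candidate in enumerate(candidates, start=1):
            print("  %d. %s" % (i, candidate))
        choice = candidates[int(input("> ")) - 1]  # the user picks one reading
        learned[key] = choice                      # remember it for next time
        return choice

    # Hypothetical ambiguity for "How big is California?" in a geography ontology.
    prop = clarify("how big", ["geo:area", "geo:population"], "geography")
    print("Interpreting the question using %s." % prop)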

Related work

While research has been active in testing the usability of various semantic search interfaces (see [29], [34]), little work has been done in the area of testing the usability of NLIs to ontologies themselves. The evaluation campaigns of the SEALS project [35] partially address this problem; however, there is little emphasis on testing individual usability enhancement methods and their effect on habitability, as well as on the overall performance and usability of

Conclusion

The NLIs to ontologies that we discuss in this paper are those that are portable and whose supported language is flexible enough to cover not only grammatically correct questions, but also question fragments and ill-formed questions. In particular, we discussed the application of feedback and clarification dialogues and how they can affect the habitability of such Natural Language Interfaces to Ontologies.

First, we looked at the effect of modelling feedback by showing users the

Acknowledgements

We would like to thank Abraham Bernstein and Esther Kaufmann from the University of Zurich for sharing with us the Mooney dataset in OWL format, and R.J. Mooney from the University of Texas for making this dataset publicly available. Grateful acknowledgement for proofreading and correcting the English goes to Amy Walkers from Kuato Studios.

This research has been partially supported by the EU-funded TAO (FP6-026460) and LarKC (FP7-215535) projects.

References (37)

  • W. Ogden et al., Habitability in question–answering systems, Series: Text, Speech and Language Technology.

  • E. Zolton-Ford, Reducing variability in natural-language interactions with computers.

  • B.M. Slator et al., Pygmalion at the interface, Commun. ACM (1986).

  • H.H. Clark et al., Grounding in communication.

  • D. Frohlich et al., Management of repair in human–computer interaction, Hum.-Comput. Interact. (1994).

  • D. Damljanovic, V. Tablan, K. Bontcheva, A text-based query interface to OWL ontologies, in: 6th Language Resources and...

  • A. Bangor et al., Determining what individual SUS scores mean: adding an adjective rating scale, J. Usability Stud. (2009).

  • Hai H. Wang et al., Transition of legacy systems to semantically enabled applications: TAO method and tools, Semantic Web (2012).