MEANS: A medical question-answering system combining NLP techniques and semantic Web technologies

https://doi.org/10.1016/j.ipm.2015.04.006

Highlights

  • We propose a semantic question answering system (MEANS) for the medical domain.

  • We introduce a novel query relaxation approach for question answering.

  • MEANS integrates NLP methods allowing a deep analysis of questions and documents.

  • MEANS uses semantic Web technologies and standards for data sharing and integration.

  • Our experiments show promising results in terms of MRR and precision.

Abstract

The Question Answering (QA) task aims to provide precise and quick answers to user questions from a collection of documents or a database. This kind of IR system is sorely needed given the dramatic growth of digital information. In this paper, we address the problem of QA in the medical domain, where several specific conditions are met. We propose a semantic approach to QA based on (i) Natural Language Processing techniques, which allow a deep analysis of medical questions and documents, and (ii) semantic Web technologies at both the representation and interrogation levels. We present our semantic question-answering system, called MEANS, and our proposed method for “Answer Search” based on semantic search and query relaxation. We evaluate the overall system performance on real questions and answers extracted from MEDLINE articles. Our experiments show promising results and suggest that a query-relaxation strategy can further improve the overall performance.

Introduction

The increasing amount of knowledge accessible via the Internet is changing how we find information and obtain answers to our questions. According to an American health survey published in January 2013, 35% of U.S. adults state that they have gone online specifically to try to figure out what medical condition they or someone else might have. Asked about the accuracy of their initial diagnosis, 41% of “online diagnosers” (who searched for answers on the Internet) say a medical professional confirmed their diagnosis, but 35% say they did not visit a clinician to get a professional opinion. 18% say they consulted a medical professional and the clinician either did not agree or offered a different opinion about the condition. 77% say that they start looking for health information using a search engine. However, while these search engines contribute strongly to making large volumes of medical knowledge accessible, their users often have to deal with the burden of browsing and filtering the numerous results of their queries in order to find the precise information they were looking for. This point is even more crucial for practitioners, who may need an immediate answer to their questions during their work.

Ely et al. (1999) presented an observational study in which investigators visited family doctors for two half days and collected their questions. The 103 doctors saw 2467 patients and asked 1101 questions during 732 observation hours. Each doctor asked an average of 7.6 questions during the two half days (3.2 questions per 10 patients).

Covell, Uman, and Manning (1985) studied the information needs of physicians during office practice. In their study, information needs were obtained by self-reports from 47 physicians, who raised a total of 269 questions during their half-day practice. The raised questions were very heterogeneous in terms of topics and highly specific to the patients. On average, only 30% of their information needs were met during the patient visit, most often by other physicians with different subspecialties. As shown in their study, print sources were not used for several reasons, such as inadequate indexing of books and drug information sources, the age of the available textbooks, lack of knowledge about the relevant sources, or the time needed to access the required information.

In this context, we need tools such as question answering (QA) systems in order to respond to user queries with precise answers. Such systems require a deep analysis of both user questions and documents in order to extract the relevant information. The first level of this information consists of the medical entities (e.g. diseases, drugs, symptoms); the second, more complex level is the extraction of semantic relations between these entities (e.g. treats, prevents, causes).

Within an overall common framework, QA systems aim to provide precise answers to natural language questions. The answer can be a piece of text extracted from a document collection (Demner-Fushman & Lin, 2006) or the Web (Lin & Katz, 2003), as well as data retrieved from a database (Popescu, Etzioni, & Kautz, 2003) or a knowledge base (Rinaldi, Dowdall, & Schneider, 2004). In rarer cases, the returned answers are multimedia information (Katz, 1999). A question answering system can be composed of three main tasks: (i) analysis of the user question, (ii) analysis of the documents used to find the answers and (iii) answer retrieval and extraction. The second task is not required for systems that use databases or knowledge bases as answer sources. Methods used to analyze questions and/or documents can be semantic, surface-level or hybrid.
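To make this division of labor concrete, the following minimal Python skeleton sketches the three tasks. It is an illustrative sketch, not the MEANS implementation, and all names in it are hypothetical:

    from dataclasses import dataclass

    # Illustrative pipeline skeleton, not the authors' code.

    @dataclass
    class AnalyzedQuestion:
        expected_answer_type: str  # e.g. "DRUG" for factual, "YES_NO" for boolean
        focus: str                 # the medical entity the question is about
        relation: str              # e.g. "treats", "prevents", "causes"

    KnowledgeBase = dict  # stand-in for the annotated answer sources

    def analyze_question(question: str) -> AnalyzedQuestion:
        """Task (i): deep NLP analysis of the user question."""
        raise NotImplementedError

    def annotate_documents(corpus: list) -> KnowledgeBase:
        """Task (ii): offline analysis of the documents used to find answers;
        skipped when the answer source is already a database or knowledge base."""
        raise NotImplementedError

    def search_answers(q: AnalyzedQuestion, kb: KnowledgeBase) -> list:
        """Task (iii): answer retrieval and extraction."""
        raise NotImplementedError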

In this paper, we address the problem of answering English questions formulated in natural language. We consider several types of questions, but we focus on two main types: (i) factual questions expressed by WH pronouns and (ii) boolean questions expecting a yes/no answer. An answer can be (i) a medical entity for factual questions or (ii) Yes or No for boolean questions. Moreover, with each answer extracted from a corpus, we associate a justification including the line containing the answer, the two previous sentences and the two following sentences. We focus on searching and extracting answers from scientific articles and clinical texts. However, the proposed approach can be extended to consider other resources such as websites, textual corpora, Linked Open Data and ontologies.
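The justification described above is a simple context window around the answer. A minimal sketch (ours, assuming sentence segmentation has already been done):

    def justification(sentences, answer_idx, window=2):
        """Return the sentence containing the answer plus the `window`
        sentences before and after it, clipped at document boundaries."""
        start = max(0, answer_idx - window)
        end = min(len(sentences), answer_idx + window + 1)
        return sentences[start:end]

    # Example: if the answer is found in sentence 4 of a 10-sentence
    # abstract, sentences 2..6 are returned as the justification.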

There are three main contributions in this paper:

  • 1.

    We propose an original system for medical QA combining: (i) NLP methods which allow a deep analysis of medical questions and corpora used to find answers and (ii) semantic Web technologies which offer a high level of expressiveness and standards for data sharing and integration.

  • 2.

    We introduce a novel query relaxation approach for QA systems that deals with errors or weaknesses of NLP methods in some cases (e.g. implicit information, need for reasoning).

  • 3.

    We experimentally evaluate our system, called MEANS, with a benchmark (Corpus for Evidence Based Medicine Summarisation) and we discuss the obtained results.

The remainder of the paper is organized as follows. Section 2 introduces related work and discusses the main QA approaches, with a particular focus on the medical domain. Section 3 describes the overall architecture of the proposed approach and its three main steps: offline corpora annotation using NLP methods (described in Section 4), online question analysis (described in Section 5) and answer retrieval based on semantic search and query relaxation (described in Section 6). Section 7 presents our experiments on a standard corpus and the results of our QA system MEANS. In Section 8, we discuss the combined use of NLP methods and semantic technologies, then analyze the error cases for the boolean and factual questions. Finally, Section 9 concludes the paper.

Section snippets

Related work

BASEBALL (Green, Wolf, Chomsky, & Laughery, 1961) and LUNAR (Woods, 1973) are among the first known question answering systems. BASEBALL was able to answer questions about dates, locations and American baseball games. LUNAR was one of the first scientific question-answering systems. It was conceived to support the geological analysis of the rocks brought back by the Apollo missions. In its evaluation, it correctly answered 90% of the questions asked by human users. Both BASEBALL and LUNAR exploited

Proposed approach

In this paper we propose a semantic approach to medical question-answering from document corpora. Fig. 1 presents the main steps of our approach, which are: corpora annotation (detailed in Section 4), question analysis (described in Section 5) and answer search (Section 6). We apply NLP methods to analyze both the source documents used to extract the answers and the user questions expressed in natural language (NL).

We exploit this first NL analysis to build RDF annotations of the source documents
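For instance, an extracted relation such as “drug X treats disease Y” can be materialized as RDF triples attached to the source document. A minimal sketch with rdflib; the namespace and property names are illustrative assumptions, not the actual MEANS vocabulary:

    from rdflib import Graph, Namespace, URIRef, Literal

    MED = Namespace("http://example.org/med#")  # hypothetical vocabulary

    g = Graph()
    g.bind("med", MED)

    doc = URIRef("http://example.org/doc/medline-123")  # hypothetical document URI
    g.add((MED.Aspirin, MED.treats, MED.Headache))      # extracted semantic relation
    g.add((MED.Aspirin, MED.extractedFrom, doc))        # provenance of the annotation
    g.add((MED.Aspirin, MED.preferredLabel, Literal("aspirin")))

    print(g.serialize(format="turtle"))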

Offline corpora annotation using NLP methods

NLP methods exploiting medical and semantic resources (e.g. UMLS) are well suited to process textual corpora and to annotate them. We use NLP methods to extract medical entities and semantic relations.
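As a toy illustration of the pattern-based side of such methods (the pattern and example are ours; real extractors combine MetaMap-style entity recognition with richer patterns or statistical models):

    import re

    # Naive surface pattern for a "treats" relation between two entities.
    TREATS = re.compile(
        r"(?P<drug>[A-Z][\w-]*)\s+(?:treats|is effective against)\s+(?P<disease>[\w -]+)"
    )

    m = TREATS.search("Metformin treats type 2 diabetes.")
    if m:
        print(m.group("drug"), "--treats-->", m.group("disease"))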

Question classification

We propose a classification of medical questions into 10 categories (a toy classifier sketch is given after the list):

  • 1.

    Yes/No questions (e.g. “Can Group B streptococcus cause urinary tract infections in adults?”)

  • 2.

    Explanation, Reason or “why” questions (e.g. “Why do phenobarbital and Dilantin counteract each other?”)

  • 3.

    Condition, case or most “when” questions (e.g. “When would you use gemfibrozil rather than an HMG (3-hydroxy-3-methylglutaryl) coenzyme A inhibitor?”)

  • 4.

    Manner, some “how” questions (e.g. (i) “How are homocysteine and
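A toy rule-based classifier covering the first few of the categories above (illustrative only, not the MEANS implementation):

    import re

    RULES = [
        (re.compile(r"^(can|does|do|is|are|should)\b", re.I), "YES_NO"),
        (re.compile(r"^why\b", re.I),                         "EXPLANATION"),
        (re.compile(r"^when\b", re.I),                        "CONDITION"),
        (re.compile(r"^how\b", re.I),                         "MANNER"),
    ]

    def classify(question):
        for pattern, category in RULES:
            if pattern.search(question):
                return category
        return "OTHER"

    print(classify("Can Group B streptococcus cause urinary tract infections in adults?"))
    # -> YES_NO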

Answer search

Ontology-based information retrieval has several benefits such as (i) handling synonymy and morphological variations and (ii) allowing semantic query expansion or approximation through subsumption and domain relations. For example, if we search the term “ACTH stimulation test”, the semantic search engine can find documents containing “cosyntropin test” or “tetracosactide test” or “Synacthen test” as they are synonyms. Also, if we search treatments for “Cardiovascular disease” we can also find,
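Since MEANS translates questions into SPARQL (Ben Abacha & Zweigenbaum, 2012), relaxation can be pictured as follows. A sketch with rdflib, where the med: vocabulary and the file name are hypothetical: a strict query is relaxed along rdfs:subClassOf when it returns no answer, so that treatments of more specific cardiovascular disorders are also retrieved:

    from rdflib import Graph, Namespace
    from rdflib.namespace import RDFS

    MED = Namespace("http://example.org/med#")  # hypothetical vocabulary

    g = Graph()
    g.parse("annotations.ttl")  # hypothetical file of offline RDF annotations

    # Strict query: drugs that treat the exact disease in the question.
    strict = "SELECT ?drug WHERE { ?drug med:treats med:Cardiovascular_disease . }"

    # Relaxed query: also accept drugs that treat any subclass of the
    # disease, reached through the subsumption hierarchy.
    relaxed = """
        SELECT ?drug WHERE {
            ?disease rdfs:subClassOf* med:Cardiovascular_disease .
            ?drug med:treats ?disease .
        }"""

    ns = {"med": MED, "rdfs": RDFS}
    answers = list(g.query(strict, initNs=ns))
    if not answers:  # relax only when the strict query fails
        answers = list(g.query(relaxed, initNs=ns))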

Evaluation criteria

Several QA evaluation campaigns, such as TREC (English), CLEF (multi-language), NTCIR (Japanese) and Quaero (French, English), have been conducted in the open domain. In the medical field, English QA challenges are rare. The Genomics task of the TREC challenge can be seen as a related track, even if it was not officially introduced as such.

The performance of QA systems is often evaluated using the Mean Reciprocal Rank (MRR) and precision.
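For reference, MRR over a question set Q is standardly defined as

    \mathrm{MRR} = \frac{1}{|Q|} \sum_{i=1}^{|Q|} \frac{1}{\mathrm{rank}_i}

where rank_i is the rank of the first correct answer returned for question i (the term is taken as 0 when no correct answer is returned).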

NLP methods

We experimented with NLP methods on different corpora (e.g. clinical texts, scientific articles). Our experiments confirm that:

  • Rule- or pattern-based methods have average, or even weak, performance on large corpora with heterogeneous documents.

  • Statistical methods can be very robust, but their performance diminishes significantly if (i) only a small number of training examples is available and/or (ii) the test corpus has characteristics different from those of the training corpus.

To tackle the scalability

Conclusion

In this paper, we tackled automatic Question Answering in the medical domain and presented an original approach having four main characteristics:

  • The proposed approach allows dealing with different types of questions, including questions with more than one expected answer type and more than one focus.

    It allows a deep analysis of questions and corpora using different information extraction methods based on (i) domain knowledge and (ii) Natural Language Processing techniques (e.g. use of patterns,

References (45)

  • Lopez, V., et al. (2013). Evaluating question answering over linked data. Journal of Web Semantics.
  • Terol, R. M., et al. (2007). A knowledge based method for the medical question answering problem. Computers in Biology and Medicine.
  • Aronson, A. R. (2001). Effective mapping of biomedical text to the UMLS metathesaurus: The MetaMap program (Vol. 8, pp....
  • Ben Abacha, A., & Zweigenbaum, P. (2011). A hybrid approach for the extraction of semantic relations from MEDLINE...
  • Ben Abacha, A., & Zweigenbaum, P. (2012). Medical question answering: Translating medical questions into sparql...
  • Ben Abacha, A., et al. Medical entity recognition: A comparison of semantic and statistical methods.
  • Cao, Y.-G., et al. Evaluation of the clinical question answering presentation.
  • Cimiano, P., Haase, P., Heizmann, J., Mantel, M., & Studer, R. (2008). Towards portable natural language interfaces to...
  • Cohen, K. B., et al. (2011). High-precision biological event extraction: Effects of system and of data. Computational Intelligence.
  • Covell, D. G., et al. (1985). Information needs in office practice: Are they being met? Annals of Internal Medicine.
  • Demner-Fushman, D., & Lin, J. (2005). Knowledge extraction for clinical question answering: Preliminary results. In...
  • Demner-Fushman, D., & Lin, J. J. (2006). Answer extraction, semantic clustering, and extractive summarization for...
  • Ely, J. W., et al. (1999). Analysis of questions asked by family doctors regarding patient care. BMJ.
  • Ely, J. W., et al. (2002). Obstacles to answering doctors’ questions about patient care with evidence: Qualitative study. British Medical Journal.
  • Ely, J. W., et al. (2000). A taxonomy of generic clinical questions: Classification study. British Medical Journal.
  • Green, B. F., et al. (1961). Baseball: An automatic question-answerer.
  • Humphreys, B. L., et al. (1993). The UMLS project: Making the conceptual connection between users and the information they need. Bulletin of the Medical Library Association.
  • Jacquemart, P., et al. Towards a medical question-answering system: A feasibility study.
  • Katz, B. (1999). From sentence processing to information access on the world wide web. In AAAI spring symposium on...
  • Katz, B., Felshin, S., Yuret, D., Ibrahim, A., Lin, J. J., Marton, G., et al. (2002). Omnibase: Uniform access to...
  • Kilicoglu, H., et al. (2011). Effective bio-event extraction using trigger words and syntactic dependencies. Computational Intelligence.
  • Lafferty, J. D., McCallum, A., & Pereira, F. C. N. (2001). Conditional random fields: Probabilistic models for...