MEANS: A medical question-answering system combining NLP techniques and semantic Web technologies

https://doi.org/10.1016/j.ipm.2015.04.006

Highlights

  • We propose a semantic question answering system (MEANS) for the medical domain.

  • We introduce a novel query relaxation approach for question answering.

  • MEANS integrates NLP methods allowing a deep analysis of questions and documents.

  • MEANS uses semantic Web technologies and standards for data sharing and integration.

  • Our experiments show promising results in terms of MRR and precision.

Abstract

The Question Answering (QA) task aims to provide precise and quick answers to user questions from a collection of documents or a database. This kind of IR system is sorely needed given the dramatic growth of digital information. In this paper, we address the problem of QA in the medical domain, where several specific conditions are met. We propose a semantic approach to QA based on (i) Natural Language Processing techniques, which allow a deep analysis of medical questions and documents, and (ii) semantic Web technologies at both the representation and interrogation levels. We present our semantic question-answering system, called MEANS, and our proposed method for “Answer Search” based on semantic search and query relaxation. We evaluate the overall system performance on real questions and answers extracted from MEDLINE articles. Our experiments show promising results and suggest that a query-relaxation strategy can further improve the overall performance.

Introduction

The increasing amount of knowledge accessible via the Internet is changing how we find information and obtain answers to our questions. According to an American health survey published in January 2013, 35% of U.S. adults state that they have gone online specifically to try to figure out what medical condition they or someone else might have. Asked about the accuracy of their initial diagnosis, 41% of “online diagnosers” (who searched for answers on the Internet) say a medical professional confirmed their diagnosis, but 35% say they did not visit a clinician to get a professional opinion. 18% say they consulted a medical professional and the clinician either did not agree or offered a different opinion about the condition. 77% say that they start looking for health information using a search engine. However, while these search engines contribute strongly to making large volumes of medical knowledge accessible, their users often have to deal with the burden of browsing and filtering the numerous results of their queries in order to find the precise information they were looking for. This point is even more crucial for practitioners, who may need an immediate answer to their questions during their work.

Ely et al. (1999) presented an observational study in which investigators visited family doctors for two half days and collected their questions. The 103 doctors saw 2467 patients and asked 1101 questions during 732 observation hours. Each doctor asked an average of 7.6 questions during the two half days (3.2 questions per 10 patients).

Covell, Uman, and Manning (1985) studied the information needs of physicians during office practice. In their study, information needs were obtained by self-reports from 47 physicians, who raised a total of 269 questions during their half-day practice. The raised questions were very heterogeneous in terms of topics and highly specific to the patients. On average, only 30% of their information needs were met during the patient visit, most often by other physicians with different subspecialties. As shown in their study, print sources were not used for several reasons, such as inadequate indexing of books and drug information sources, the age of the available textbooks, lack of knowledge about the relevant sources, or the time needed to access the required information.

In this context, we need tools such as question answering (QA) systems in order to respond to user queries with precise answers. Such systems require a deep analysis of both user questions and documents in order to extract the relevant information. The first level of this information consists of the medical entities (e.g. diseases, drugs, symptoms); the second, more complex level is the extraction of semantic relations between these entities (e.g. treats, prevents, causes).

Within an overall common framework, QA systems aim to provide precise answers to natural language questions. The answer can be a piece of text extracted from a document collection (Demner-Fushman & Lin, 2006) or the Web (Lin & Katz, 2003), as well as data retrieved from a database (Popescu, Etzioni, & Kautz, 2003) or a knowledge base (Rinaldi, Dowdall, & Schneider, 2004). In rarer cases, the returned answers are multimedia information (Katz, 1999). A question answering system can be composed of three main tasks: (i) analysis of the user question, (ii) analysis of the documents used to find the answers and (iii) answer retrieval and extraction. The second task is not required for systems that use databases or knowledge bases as answer sources. Methods used to analyze questions and/or documents can be semantic, surface-level or hybrid.
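To make this division of labor concrete, the following minimal Python skeleton sketches the three tasks. It is an illustrative sketch, not the MEANS implementation, and all names in it are hypothetical:

    from dataclasses import dataclass

    # Illustrative pipeline skeleton, not the authors' code.

    @dataclass
    class AnalyzedQuestion:
        expected_answer_type: str  # e.g. "DRUG" for factual, "YES_NO" for boolean
        focus: str                 # the medical entity the question is about
        relation: str              # e.g. "treats", "prevents", "causes"

    KnowledgeBase = dict  # stand-in for the annotated answer sources

    def analyze_question(question: str) -> AnalyzedQuestion:
        """Task (i): deep NLP analysis of the user question."""
        raise NotImplementedError

    def annotate_documents(corpus: list) -> KnowledgeBase:
        """Task (ii): offline analysis of the documents used to find answers;
        skipped when the answer source is already a database or knowledge base."""
        raise NotImplementedError

    def search_answers(q: AnalyzedQuestion, kb: KnowledgeBase) -> list:
        """Task (iii): answer retrieval and extraction."""
        raise NotImplementedError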

In this paper, we address the problem of answering English questions formulated in natural language. We consider several types of questions, but we focus on two main types: (i) factual questions expressed by WH pronouns and (ii) boolean questions expecting a yes/no answer. An answer can be (i) a medical entity for factual questions or (ii) Yes or No for boolean questions. Moreover, with each answer extracted from a corpus, we associate a justification including the line containing the answer, the two previous sentences and the two following sentences. We focus on searching and extracting answers from scientific articles and clinical texts. However, the proposed approach can be extended to consider other resources such as websites, textual corpora, Linked Open Data and ontologies.
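The justification described above is a simple context window around the answer. A minimal sketch (ours, assuming sentence segmentation has already been done):

    def justification(sentences, answer_idx, window=2):
        """Return the sentence containing the answer plus the `window`
        sentences before and after it, clipped at document boundaries."""
        start = max(0, answer_idx - window)
        end = min(len(sentences), answer_idx + window + 1)
        return sentences[start:end]

    # Example: if the answer is found in sentence 4 of a 10-sentence
    # abstract, sentences 2..6 are returned as the justification.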

There are three main contributions in this paper:

  • 1.

    We propose an original system for medical QA combining: (i) NLP methods which allow a deep analysis of medical questions and corpora used to find answers and (ii) semantic Web technologies which offer a high level of expressiveness and standards for data sharing and integration.

  • 2.

    We introduce a novel query relaxation approach for QA systems that deals with errors or weaknesses of NLP methods in some cases (e.g. implicit information, need for reasoning).

  • 3.

    We experimentally evaluate our system, called MEANS, with a benchmark (Corpus for Evidence Based Medicine Summarisation) and we discuss the obtained results.

The remainder of the paper is organized as follows. Section 2 introduces related work and discusses the main QA approaches, with a particular focus on the medical domain. Section 3 describes the overall architecture of the proposed approach and its three main steps: offline corpora annotation using NLP methods (described in Section 4), online question analysis (described in Section 5) and answer retrieval based on semantic search and query relaxation (described in Section 6). Section 7 presents our experiments on a standard corpus and the results of our QA system MEANS. In Section 8, we discuss the combined use of NLP methods and semantic technologies, then analyze the error cases for the boolean and factual questions. Finally, Section 9 concludes the paper.

Section snippets

Related work

BASEBALL (Green, Wolf, Chomsky, & Laughery, 1961) and LUNAR (Woods, 1973) are among the first known question answering systems. BASEBALL was able to answer questions about dates, locations and American baseball games. LUNAR was one of the first scientific question-answering systems. It was conceived to support the geological analysis of the rocks brought back by the Apollo missions. In its evaluation, it correctly answered 90% of the questions asked by human users. Both BASEBALL and LUNAR exploited

Proposed approach

In this paper we propose a semantic approach to medical question-answering from document corpora. Fig. 1 presents the main steps of our approach, which are: corpora annotation (detailed in Section 4), question analysis (described in Section 5) and answer search (Section 6). We apply NLP methods to analyze both the source documents used to extract the answers and the user questions expressed in natural language (NL).

We exploit this first NL analysis to build RDF annotations of the source documents
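For instance, an extracted relation such as “drug X treats disease Y” can be materialized as RDF triples attached to the source document. A minimal sketch with rdflib; the namespace and property names are illustrative assumptions, not the actual MEANS vocabulary:

    from rdflib import Graph, Namespace, URIRef, Literal

    MED = Namespace("http://example.org/med#")  # hypothetical vocabulary

    g = Graph()
    g.bind("med", MED)

    doc = URIRef("http://example.org/doc/medline-123")  # hypothetical document URI
    g.add((MED.Aspirin, MED.treats, MED.Headache))      # extracted semantic relation
    g.add((MED.Aspirin, MED.extractedFrom, doc))        # provenance of the annotation
    g.add((MED.Aspirin, MED.preferredLabel, Literal("aspirin")))

    print(g.serialize(format="turtle"))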

Offline corpora annotation using NLP methods

NLP methods exploiting medical and semantic resources (e.g. UMLS) are well suited to process textual corpora and to annotate them. We use NLP methods to extract medical entities and semantic relations.
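As a toy illustration of the pattern-based side of such methods (the pattern and example are ours; real extractors combine MetaMap-style entity recognition with richer patterns or statistical models):

    import re

    # Naive surface pattern for a "treats" relation between two entities.
    TREATS = re.compile(
        r"(?P<drug>[A-Z][\w-]*)\s+(?:treats|is effective against)\s+(?P<disease>[\w -]+)"
    )

    m = TREATS.search("Metformin treats type 2 diabetes.")
    if m:
        print(m.group("drug"), "--treats-->", m.group("disease"))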

Question classification

We propose a classification of medical questions into 10 categories (a toy classifier sketch is given after the list):

  • 1.

    Yes/No questions (e.g. “Can Group B streptococcus cause urinary tract infections in adults?”)

  • 2.

    Explanation, Reason or “why” questions (e.g. “Why do phenobarbital and Dilantin counteract each other?”)

  • 3.

    Condition, case or most “when” questions (e.g. “When would you use gemfibrozil rather than an HMG (3-hydroxy-3-methylglutaryl) coenzyme A inhibitor?”)

  • 4.

    Manner, some “how” questions (e.g. (i) “How are homocysteine and
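A toy rule-based classifier covering the first few of the categories above (illustrative only, not the MEANS implementation):

    import re

    RULES = [
        (re.compile(r"^(can|does|do|is|are|should)\b", re.I), "YES_NO"),
        (re.compile(r"^why\b", re.I),                         "EXPLANATION"),
        (re.compile(r"^when\b", re.I),                        "CONDITION"),
        (re.compile(r"^how\b", re.I),                         "MANNER"),
    ]

    def classify(question):
        for pattern, category in RULES:
            if pattern.search(question):
                return category
        return "OTHER"

    print(classify("Can Group B streptococcus cause urinary tract infections in adults?"))
    # -> YES_NO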

Answer search

Ontology-based information retrieval has several benefits such as (i) handling synonymy and morphological variations and (ii) allowing semantic query expansion or approximation through subsumption and domain relations. For example, if we search the term “ACTH stimulation test”, the semantic search engine can find documents containing “cosyntropin test” or “tetracosactide test” or “Synacthen test” as they are synonyms. Also, if we search treatments for “Cardiovascular disease” we can also find,
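Since MEANS translates questions into SPARQL (Ben Abacha & Zweigenbaum, 2012), relaxation can be pictured as follows. A sketch with rdflib, where the med: vocabulary and the file name are hypothetical: a strict query is relaxed along rdfs:subClassOf when it returns no answer, so that treatments of more specific cardiovascular disorders are also retrieved:

    from rdflib import Graph, Namespace
    from rdflib.namespace import RDFS

    MED = Namespace("http://example.org/med#")  # hypothetical vocabulary

    g = Graph()
    g.parse("annotations.ttl")  # hypothetical file of offline RDF annotations

    # Strict query: drugs that treat the exact disease in the question.
    strict = "SELECT ?drug WHERE { ?drug med:treats med:Cardiovascular_disease . }"

    # Relaxed query: also accept drugs that treat any subclass of the
    # disease, reached through the subsumption hierarchy.
    relaxed = """
        SELECT ?drug WHERE {
            ?disease rdfs:subClassOf* med:Cardiovascular_disease .
            ?drug med:treats ?disease .
        }"""

    ns = {"med": MED, "rdfs": RDFS}
    answers = list(g.query(strict, initNs=ns))
    if not answers:  # relax only when the strict query fails
        answers = list(g.query(relaxed, initNs=ns))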

Evaluation criteria

Several QA evaluation campaigns, such as TREC (English), CLEF (multi-language), NTCIR (Japanese) and Quaero (French, English), have been conducted in the open domain. In the medical field, English QA challenges are rare. The Genomics task of the TREC challenge can be seen as a related track, even if it was not officially introduced as such.

The performance of QA systems is often evaluated using the Mean Reciprocal Rank (MRR) and precision.
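For reference, MRR over a question set Q is standardly defined as

    \mathrm{MRR} = \frac{1}{|Q|} \sum_{i=1}^{|Q|} \frac{1}{\mathrm{rank}_i}

where rank_i is the rank of the first correct answer returned for question i (the term is taken as 0 when no correct answer is returned).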

NLP methods

We experimented with NLP methods on different corpora (e.g. clinical texts, scientific articles). Our experiments confirm that:

  • Rule- or pattern-based methods have average, or even weak, performance on large corpora with heterogeneous documents.

  • Statistical methods can be very robust, but their performance diminishes significantly if (i) only a small number of training examples is available and/or (ii) the test corpus has characteristics different from those of the training corpus.

To tackle the scalability

Conclusion

In this paper, we tackled automatic Question Answering in the medical domain and presented an original approach having four main characteristics:

  • The proposed approach allows dealing with different types of questions, including questions with more than one expected answer type and more than one focus.

    It allows a deep analysis of questions and corpora using different information extraction methods based on (i) domain knowledge and (ii) Natural Language Processing techniques (e.g. use of patterns,

References (45)

  • Lopez, V., et al. (2013). Evaluating question answering over linked data. Journal of Web Semantics.
  • Terol, R. M., et al. (2007). A knowledge based method for the medical question answering problem. Computers in Biology and Medicine.
  • Aronson, A. R. (2001). Effective mapping of biomedical text to the UMLS metathesaurus: The MetaMap program (Vol. 8, pp....
  • Ben Abacha, A., & Zweigenbaum, P. (2011). A hybrid approach for the extraction of semantic relations from MEDLINE...
  • Ben Abacha, A., & Zweigenbaum, P. (2012). Medical question answering: Translating medical questions into sparql...
  • Ben Abacha, A., et al. Medical entity recognition: A comparison of semantic and statistical methods.
  • Cao, Y.-G., et al. Evaluation of the clinical question answering presentation.
  • Cimiano, P., Haase, P., Heizmann, J., Mantel, M., & Studer, R. (2008). Towards portable natural language interfaces to...
  • Cohen, K. B., et al. (2011). High-precision biological event extraction: Effects of system and of data. Computational Intelligence.
  • Covell, D. G., et al. (1985). Information needs in office practice: Are they being met? Annals of Internal Medicine.
  • Demner-Fushman, D., & Lin, J. (2005). Knowledge extraction for clinical question answering: Preliminary results. In...
  • Demner-Fushman, D., & Lin, J. J. (2006). Answer extraction, semantic clustering, and extractive summarization for...
  • Ely, J. W., et al. (1999). Analysis of questions asked by family doctors regarding patient care. BMJ.
  • Ely, J. W., et al. (2002). Obstacles to answering doctors’ questions about patient care with evidence: Qualitative study. British Medical Journal.
  • Ely, J. W., et al. (2000). A taxonomy of generic clinical questions: Classification study. British Medical Journal.
  • Green, B. F., et al. (1961). Baseball: An automatic question-answerer.
  • Humphreys, B. L., et al. (1993). The UMLS project: Making the conceptual connection between users and the information they need. Bulletin of the Medical Library Association.
  • Jacquemart, P., et al. Towards a medical question-answering system: A feasibility study.
  • Katz, B. (1999). From sentence processing to information access on the world wide web. In AAAI spring symposium on...
  • Katz, B., Felshin, S., Yuret, D., Ibrahim, A., Lin, J. J., Marton, G., et al. (2002). Omnibase: Uniform access to...
  • Kilicoglu, H., et al. (2011). Effective bio-event extraction using trigger words and syntactic dependencies. Computational Intelligence.
  • Lafferty, J. D., McCallum, A., & Pereira, F. C. N. (2001). Conditional random fields: Probabilistic models for...