A system for information retrieval in a medical digital library based on modular ontologies and query reformulation

Multimedia Tools and Applications

Abstract

Ontologies have proven useful in Information Retrieval, and in recent years the biomedical informatics community has acknowledged their utility. However, manually building and updating ontologies is a long and tedious task. This paper proposes a system that allows any search engine to develop its semantic layer by applying ontology learning techniques to Web snippets, and applies it to a well-known medical digital library, PubMed. The new system (SemPubMed) automatically builds new ontology fragments related to the user’s query and then reformulates queries using the new concepts in order to improve information retrieval. The system has undergone a twofold evaluation. On the one hand, we have evaluated the quality of the modular ontologies built by the system. On the other hand, we have studied how the semantic reformulation of the queries improves the quality of the results returned by PubMed, in terms of both precision and recall. The results show that adding a semantic layer to PubMed improves query reformulation and the predicted ranking score.
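
To make the two ideas in the abstract concrete, the following is a minimal sketch, not the SemPubMed implementation. It assumes that reformulation simply joins the original query with related concept labels using Boolean OR, and it computes precision and recall over sets of document identifiers. The function names, the OR-expansion strategy, and the sample identifiers are all illustrative assumptions.

```python
def reformulate(query, related_concepts):
    """Expand a query with related concept labels (assumed OR-expansion).

    Concepts that duplicate the original query (case-insensitively) are dropped.
    """
    terms = [query]
    for concept in related_concepts:
        if concept.lower() != query.lower():
            terms.append(concept)
    # Quote each term so multi-word concepts are treated as phrases.
    return " OR ".join(f'"{term}"' for term in terms)


def precision_recall(retrieved, relevant):
    """Return (precision, recall) for retrieved vs. relevant document IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical concept labels and document identifiers, for illustration only.
    print(reformulate("asthma", ["bronchial asthma", "Asthma"]))
    print(precision_recall(["d1", "d2", "d3", "d4"], ["d1", "d2", "d5"]))
```

In this sketch, precision is the fraction of retrieved documents that are relevant and recall is the fraction of relevant documents that were retrieved, so a reformulated query can be compared with the original by running both against the same relevance judgments.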

Notes

  1. http://www.ncbi.nlm.nih.gov/mesh

  2. http://developer.yahoo.com/search/boss/

  3. http://nlp.stanford.edu/software/tagger.shtml

  4. http://www.nlm.nih.gov/mesh/MBrowser.html

References

  1. Baruzzo A, Casoto P, Challapalli P, Dattolo A, Pudota N, Tasso C (2009) Toward semantic digital libraries: exploiting Web 2.0 and semantic services in cultural heritage. J Digit Inf 10(6)

  2. Ben Mustapha N, Aufaure M, Baazaoui Zghal H, Ben Ghezala H (2012) Modular ontological warehouse for adaptative information search. In: MEDI 2012, pp 79–90

  3. Ben Mustapha N, Aufaure M-A, Baazaoui-Zghal H, Ben-Ghzala H (2011) Contextual ontology module learning from Web snippets and past user queries. In: Procs. of the 15th int. conf. on knowledge-based and intelligent information and engineering systems, KES’11, pp 538–547

  4. Berland M, Charniak E (1999) Finding parts in very large corpora. In: Proceedings of the 37th annual meeting of the association for computational linguistics, ACL ’99, pp 57–64

  5. Bettembourg C, Diot C, Burgun A, Dameron O (2012) GO2PUB: querying PubMed with semantic expansion of gene ontology terms. J Bio Semant 3:7

  6. Boldi P, Bonchi F, Castillo C, Vigna S (2011) Query reformulation mining: models, patterns and applications. J Inf Ret Arch 14(3):257–289

  7. Manning CD, Schütze H (1999) Foundations of statistical natural language processing. MIT Press

  8. Corby O, Dieng-Kuntz R, Faron-Zucker C (2004) Querying the semantic web with Corese search engine. In: de Mántaras RL, Saitta L (eds) Proceedings of the European conference on artificial intelligence (ECAI 2004). IOS Press, pp 705–709

  9. Elloumi-Chaabene M, Ben Mustapha N, Baazaoui Zghal H, Moreno A, Sánchez D (2011) Semantic-based composition of modular ontologies applied to web query reformulation. In: Proceedings of the 6th international conference on software and data technologies ICSOFT, pp 305–308

  10. Kafsi S, Ben Mustapha N, Baazaoui Zghal H, Moreno A (2012) Sem-PubMed: a semantic medical digital library that integrates ontology learning and query reformulation. KES, pp 1932–1941

  11. Kiefer S, Rauch J, Albertoni R, Attene M, Giannini F, Marini S, Schneider L, Mesquita C, Xing X (2011) An ontology-driven search module for accessing chronic pathology literature. OTM Workshops, pp 382–391

  12. Mastora A, Monopoli M, Kapidakis S (2008) Exploring query formulation and reformulation: a preliminary study to map users’ search behaviour. In: Christensen-Dalsgaard B et al (eds) ECDL 2008, LNCS 5173, pp 427–430

  13. Mayr P, Mutschke P, Petras V (2007) Reducing semantic complexity in distributed digital libraries: treatment of term vagueness and document re-ranking. GESIS-IZ Social Science Information Centre, pp 213–234

  14. Ferran N, Mor E, Minguillon J (2005) Towards personalization in digital libraries through ontologies. Libr Manag 26(4–5):206–217

  15. Perez-Carballo J, Xie I (2011) Design principles of help systems for digital libraries. University of Wisconsin-Milwaukee

  16. Price C, Summers R (2010) Decision support in large-scale healthcare information systems: the challenge of integrating ontologies. Int J Biomed Eng Technol 3(3–4):375–392

  17. Sanchez D, Moreno A (2007) Bringing taxonomic structure to large digital libraries. Int J Metadata Semant Ontol 2(2):112–122

  18. Sanchez D, Moreno A (2008) Pattern-based automatic taxonomy learning from the Web. AI Commun 21(1):27–48

  19. Sanchez D, Moreno A (2008) Learning non-taxonomic relationships from web documents for domain ontology construction. Data Knowl Eng J 64:600–623

  20. Sanchez D, Moreno A, Del Vasto-Terrientes L (2012) Learning relation axioms from text: an automatic Web-based approach. Expert Syst Appl 39:5792–5805

  21. Suomela S, Kekalainen J (2005) Ontology as a search-tool: a study of real user’s query formulation with and without conceptual support. In: Proceedings of ECIR 2005. LNCS 3408. Springer, pp 315–329

  22. Tan P-N, Steinbach M, Kumar V (2005) Introduction to data mining. Addison Wesley

  23. Thinn Mya Mya Swe (2011) Intelligent Information Retrieval Within Digital Library Using Domain Ontology. Computer Science & Information Technology (CS & IT), vol 1–2

  24. Turney P (2001) Mining the Web for synonyms: PMI-IR versus LSA on TOEFL. In: Proceedings of the 12th European conference on machine learning, pp 491–510

  25. Vallet D, Fernandez M, Castells P (2005) An ontology-based information retrieval model. In: Gomez-Perez A, Euzenat J (eds) Proceedings of ESWC 2005. LNCS 3532. Springer, pp 455–470

  26. Yu H, Kim T, Oh J, Ko I, Kim S (2009) RefMed: relevance feedback retrieval system for PubMed. CIKM, pp 2099–2100

  27. Zhao P, Zhang M, Yang D, Tang S (2005) Finding hidden semantics behind reference linkages: an ontological approach for scientific digital libraries. DASFAA, pp 699–710

Acknowledgements

This research work has been supported by the Spanish-Tunisian AECID project number A/030058/10, “A framework for the integration of Ontology Learning and Semantic Search”.

The authors acknowledge the work and contributions of Nesrine Ben Mustapha and Safa Kafsi in the previous stages of this work [10].

Author information

Corresponding author

Correspondence to Hajer Baazaoui Zghal.

About this article

Cite this article

Baazaoui Zghal, H., Moreno, A. A system for information retrieval in a medical digital library based on modular ontologies and query reformulation. Multimed Tools Appl 72, 2393–2412 (2014). https://doi.org/10.1007/s11042-013-1527-4

  • DOI: https://doi.org/10.1007/s11042-013-1527-4
