Navigating through Very Large Sets of Medical Records: An Information Retrieval Evaluation Architecture for Non-standardized Text

  • Conference paper
Information Quality in e-Health (USAB 2011)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 7058)

Abstract

Despite the prevalence of informatics and advanced information systems, large amounts of unstructured text data still exist. This is especially true in medicine and health care, where free text is an indispensable part of information representation. This paper describes the motivation behind developing information retrieval systems in medicine and health care. It gives an overview of information retrieval evaluation and then describes the architecture and development of an extendible information retrieval evaluation framework. This framework allows different information retrieval tools to be compared against a gold standard in order to test their effectiveness. The paper also reviews available gold standards that can be used for research on information retrieval from medical free text.
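
To make the idea of comparing retrieval tools against a gold standard concrete, the following is a minimal sketch in Python. It is not the framework described in the paper: the function names (`evaluate_query`, `compare_tools`), the data layout, and the choice of set-based precision, recall and F1 with macro-averaging over queries are all assumptions made for illustration, and the document identifiers are invented.

```python
from typing import Dict, Iterable, Set


def evaluate_query(retrieved: Iterable[str], relevant: Set[str]) -> Dict[str, float]:
    """Set-based precision, recall and F1 for a single query.

    `retrieved` is what a tool returned; `relevant` is the gold standard.
    """
    retrieved_set = set(retrieved)
    hits = len(retrieved_set & relevant)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def compare_tools(
    runs: Dict[str, Dict[str, Iterable[str]]],
    gold: Dict[str, Set[str]],
) -> Dict[str, Dict[str, float]]:
    """Macro-average each measure per tool over all gold-standard queries.

    A query that a tool did not answer counts as an empty result list.
    """
    summary: Dict[str, Dict[str, float]] = {}
    for tool, results in runs.items():
        totals = {"precision": 0.0, "recall": 0.0, "f1": 0.0}
        for query_id, relevant in gold.items():
            scores = evaluate_query(results.get(query_id, []), relevant)
            for name, value in scores.items():
                totals[name] += value
        n = len(gold) or 1  # avoid division by zero for an empty gold standard
        summary[tool] = {name: value / n for name, value in totals.items()}
    return summary


# Hypothetical gold standard and tool outputs (identifiers are made up).
GOLD = {"q1": {"d1", "d2", "d3"}, "q2": {"d4"}}
RUNS = {
    "tool_a": {"q1": ["d1", "d2", "d9"], "q2": ["d4"]},
    "tool_b": {"q1": ["d1"], "q2": ["d5", "d6"]},
}

if __name__ == "__main__":
    for tool, scores in compare_tools(RUNS, GOLD).items():
        print(tool, {k: round(v, 3) for k, v in scores.items()})
```

Keeping the gold standard as a plain mapping from query to relevant documents means any tool that can emit a list of document identifiers per query can be scored the same way, which is the kind of tool-independent comparison the abstract describes.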

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kreuzthaler, M., Bloice, M., Simonic, KM., Holzinger, A. (2011). Navigating through Very Large Sets of Medical Records: An Information Retrieval Evaluation Architecture for Non-standardized Text. In: Holzinger, A., Simonic, KM. (eds) Information Quality in e-Health. USAB 2011. Lecture Notes in Computer Science, vol 7058. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25364-5_32

  • DOI: https://doi.org/10.1007/978-3-642-25364-5_32

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-25363-8

  • Online ISBN: 978-3-642-25364-5

  • eBook Packages: Computer Science, Computer Science (R0)
