Abstract
Despite the prevalence of informatics and advanced information systems, large amounts of unstructured text data still exist. This is especially true in medicine and health care, where free text is an indispensable part of information representation. This paper describes the motivation behind developing information retrieval systems in medicine and health care. It gives an overview of information retrieval evaluation, then describes the architecture and development of an extensible information retrieval evaluation framework. The framework allows different information retrieval tools to be compared against a gold standard in order to test their effectiveness. The paper also reviews available gold standards that can be used for research on information retrieval from medical free text.
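As a rough illustration of what comparing retrieval tools against a gold standard can look like, the sketch below scores each tool's result set with precision, recall, and F1. The abstract does not say which measures the framework uses, so these standard metrics are an assumption, and the names `evaluate`, `Scores`, `tool_a`, `tool_b`, and the document IDs are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scores:
    precision: float
    recall: float
    f1: float


def evaluate(retrieved: set[str], relevant: set[str]) -> Scores:
    """Score one tool's result set against the gold-standard relevant set."""
    hits = len(retrieved & relevant)  # documents both retrieved and relevant
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return Scores(precision, recall, f1)


# Hypothetical gold standard for one query: IDs of documents judged relevant.
gold = {"doc1", "doc3", "doc4"}

# Hypothetical result sets returned by two retrieval tools for the same query.
runs = {
    "tool_a": {"doc1", "doc2", "doc3"},
    "tool_b": {"doc4"},
}

results = {name: evaluate(retrieved, gold) for name, retrieved in runs.items()}

for name, scores in results.items():
    print(f"{name}: P={scores.precision:.2f} R={scores.recall:.2f} F1={scores.f1:.2f}")
```

Because every tool is scored against the same relevance judgments, the resulting numbers are directly comparable. A real evaluation would average such scores over many queries.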
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kreuzthaler, M., Bloice, M., Simonic, KM., Holzinger, A. (2011). Navigating through Very Large Sets of Medical Records: An Information Retrieval Evaluation Architecture for Non-standardized Text. In: Holzinger, A., Simonic, KM. (eds) Information Quality in e-Health. USAB 2011. Lecture Notes in Computer Science, vol 7058. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25364-5_32
DOI: https://doi.org/10.1007/978-3-642-25364-5_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25363-8
Online ISBN: 978-3-642-25364-5
eBook Packages: Computer Science, Computer Science (R0)