Abstract
Diagnostic decision support systems require a disease knowledge base, which can account for a dominant portion of the total development cost of such systems. Toward automated generation of a disease knowledge base, we conducted a preliminary study on efficient extraction of symptomatic expressions using MetaMap, a tool that assigns UMLS (Unified Medical Language System) semantic tags to phrases in a given medical literature text.
We first used several tags in the MetaMap output that relate to symptoms and findings to extract symptomatic terms. This straightforward approach yielded 82% recall and 64% precision. We then applied a heuristic that exploits certain patterns of tag sequences that frequently appear in typical symptomatic expressions. This simple approach achieved a 7% gain in recall without sacrificing precision.
Although the extracted information requires manual inspection, the study suggests that this simple approach can extract symptomatic expressions at very low cost. We also performed a failure analysis of the output to guide further improvement of performance.
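The two-step extraction described in the abstract (tag-based filtering, then a tag-sequence heuristic, each scored by recall and precision) can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration: the abstract names neither the tags nor the patterns used, so the semantic types `sosy` (Sign or Symptom) and `fndg` (Finding), the single pattern `blor` followed by `sosy`, the `(text, tags)` input shape, and the toy data are all hypothetical. The input is not MetaMap's actual output format, and the numbers produced are unrelated to the reported results.

```python
from typing import List, Set, Tuple

# A tagged phrase: (text, set of UMLS semantic type abbreviations).
# This shape is a simplification chosen for the sketch, not MetaMap's format.
TaggedPhrase = Tuple[str, Set[str]]

# ASSUMPTION: symptom-related semantic types; the abstract does not list them.
SYMPTOM_TYPES = {"sosy", "fndg"}  # Sign or Symptom, Finding

# ASSUMPTION: one invented tag-sequence pattern (Body Location or Region
# immediately followed by Sign or Symptom), not a pattern from the paper.
PATTERN = ("blor", "sosy")


def extract_by_tag(phrases: List[TaggedPhrase]) -> List[str]:
    """Keep every phrase that carries at least one symptom-related tag."""
    return [text for text, tags in phrases if tags & SYMPTOM_TYPES]


def extract_by_pattern(phrases: List[TaggedPhrase]) -> List[str]:
    """Join adjacent phrases whose tags match PATTERN into one expression."""
    found = []
    for (text1, tags1), (text2, tags2) in zip(phrases, phrases[1:]):
        if PATTERN[0] in tags1 and PATTERN[1] in tags2:
            found.append(f"{text1} {text2}")
    return found


def precision_recall(extracted: List[str], gold: Set[str]) -> Tuple[float, float]:
    """Precision and recall of the extracted set against a gold-standard set."""
    unique = set(extracted)
    true_positives = len(unique & gold)
    precision = true_positives / len(unique) if unique else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy data, invented for the example.
phrases = [
    ("fever", {"sosy"}),
    ("abdominal", {"blor"}),
    ("pain", {"sosy"}),
    ("aspirin", {"phsu"}),
    ("family history", {"fndg"}),
]
gold = {"fever", "abdominal pain", "rash"}

baseline = extract_by_tag(phrases)
combined = baseline + extract_by_pattern(phrases)
baseline_scores = precision_recall(baseline, gold)
combined_scores = precision_recall(combined, gold)
```

On this toy input the tag filter alone finds one of the three gold expressions, and adding the pattern step recovers the multi-word expression "abdominal pain", so recall rises while precision does not drop. That mirrors the kind of effect the abstract reports, but only qualitatively.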
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Okumura, T., Tateisi, Y. (2012). A Lightweight Approach for Extracting Disease-Symptom Relation with MetaMap toward Automated Generation of Disease Knowledge Base. In: He, J., Liu, X., Krupinski, E.A., Xu, G. (eds) Health Information Science. HIS 2012. Lecture Notes in Computer Science, vol 7231. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29361-0_20
DOI: https://doi.org/10.1007/978-3-642-29361-0_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29360-3
Online ISBN: 978-3-642-29361-0
eBook Packages: Computer Science (R0)