An Active Learning Framework for Efficient Condition Severity Classification

  • Conference paper
Artificial Intelligence in Medicine (AIME 2015)

Abstract

Understanding condition severity, as extracted from Electronic Health Records (EHRs), is important for many public health purposes. Methods requiring physicians to annotate condition severity are time-consuming and costly. Previously, a passive learning algorithm called CAESAR was developed to capture severity in EHRs. This approach required physicians to label conditions manually, an exhaustive process. We developed a framework that uses two Active Learning (AL) methods (Exploitation and Combination_XA) to decrease manual labeling efforts by selecting only the most informative conditions for training. We call our approach CAESAR-Active Learning Enhancement (CAESAR-ALE). Compared with passive methods, CAESAR-ALE’s first AL method, Exploitation, reduced labeling efforts by 64% and achieved an equivalent true positive rate, while CAESAR-ALE’s second AL method, Combination_XA, reduced labeling efforts by 48% and achieved equivalent accuracy. In addition, both of these AL methods outperformed the traditional AL method (SVM-Margin). These results demonstrate the potential of AL methods for decreasing the labeling efforts of medical experts, while achieving greater accuracy and lower costs.
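The idea of "selecting only the most informative conditions" can be illustrated with a minimal sketch of margin-based selection, the general principle behind the SVM-Margin baseline named above. This is not the paper's Exploitation or Combination_XA method, whose details the abstract does not give. The linear decision function, the feature vectors, and the names `decision_value` and `select_by_margin` are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def decision_value(weights: Sequence[float], bias: float, x: Sequence[float]) -> float:
    """Signed score of x under a linear classifier; its sign is the predicted class."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def select_by_margin(
    weights: Sequence[float],
    bias: float,
    pool: Sequence[Sequence[float]],
    batch_size: int,
) -> List[int]:
    """Return indices of the batch_size unlabeled items closest to the hyperplane.

    Items with the smallest |score| are the ones the classifier is least sure
    about, so they are sent to the expert for labeling first.
    """
    ranked = sorted(
        range(len(pool)),
        key=lambda i: abs(decision_value(weights, bias, pool[i])),
    )
    return ranked[:batch_size]


# Hypothetical classifier and unlabeled pool (assumptions, not data from the paper).
weights = [1.0, -1.0]
bias = 0.0
pool = [[3.0, 0.0], [0.5, 0.25], [0.0, 2.0], [1.0, 1.5]]

chosen = select_by_margin(weights, bias, pool, batch_size=2)
print(chosen)  # indices of the two items nearest the decision boundary
```

In a full active learning loop, the chosen items would be labeled by an expert, added to the training set, and the classifier retrained before the next selection round.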

Author information

Corresponding author

Correspondence to Nir Nissim.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Nissim, N. et al. (2015). An Active Learning Framework for Efficient Condition Severity Classification. In: Holmes, J., Bellazzi, R., Sacchi, L., Peek, N. (eds) Artificial Intelligence in Medicine. AIME 2015. Lecture Notes in Computer Science, vol 9105. Springer, Cham. https://doi.org/10.1007/978-3-319-19551-3_3

  • DOI: https://doi.org/10.1007/978-3-319-19551-3_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-19550-6

  • Online ISBN: 978-3-319-19551-3

  • eBook Packages: Computer Science, Computer Science (R0)
