An Active Learning Framework for Efficient Condition Severity Classification

Nissim, Nir; Boland, Mary Regina; Moskovitch, Robert; Tatonetti, Nicholas P.; Elovici, Yuval; Shahar, Yuval; Hripcsak, George

doi:10.1007/978-3-319-19551-3_3

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9105)

Included in the following conference series:

Conference on Artificial Intelligence in Medicine in Europe

Abstract

Understanding condition severity, as extracted from Electronic Health Records (EHRs), is important for many public health purposes. Methods that require physicians to annotate condition severity are time-consuming and costly. Previously, a passive learning algorithm called CAESAR was developed to capture severity in EHRs. That approach required physicians to label conditions manually, a laborious process. We developed a framework that uses two Active Learning (AL) methods (Exploitation and Combination_XA) to decrease manual labeling efforts by selecting only the most informative conditions for training. We call our approach CAESAR-Active Learning Enhancement (CAESAR-ALE). Compared with passive methods, CAESAR-ALE’s first AL method, Exploitation, reduced labeling efforts by 64% and achieved an equivalent true positive rate, while its second AL method, Combination_XA, reduced labeling efforts by 48% and achieved equivalent accuracy. In addition, both of these AL methods outperformed the traditional AL method (SVM-Margin). These results demonstrate the potential of AL methods for decreasing the labeling efforts of medical experts, while achieving greater accuracy and lower costs.
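To make the idea of "selecting only the most informative conditions" concrete, the following is a minimal sketch of the selection step used by the traditional SVM-Margin baseline named in the abstract, which queries the unlabeled instances that lie closest to the classifier's separating hyperplane. It is not an implementation of Exploitation or Combination_XA, whose details the abstract does not give. The weights, bias, feature vectors, and condition names are all hypothetical, and a trained linear classifier is assumed to be already available.

```python
from math import sqrt


def distance_to_hyperplane(x, weights, bias):
    """Unsigned geometric distance of x from the hyperplane w.x + b = 0."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    norm = sqrt(sum(w * w for w in weights))
    return abs(score) / norm


def select_by_margin(pool, weights, bias, batch_size):
    """Return the names of the batch_size pool items nearest the hyperplane.

    These are the items the current classifier is least certain about,
    so they are the ones sent to the expert for labeling.
    """
    ranked = sorted(
        pool,
        key=lambda name: distance_to_hyperplane(pool[name], weights, bias),
    )
    return ranked[:batch_size]


# Hypothetical linear model and unlabeled pool (illustrative values only).
WEIGHTS = [3.0, 4.0]
BIAS = -3.5
POOL = {
    "condition_a": [0.9, 0.8],   # far on the positive side
    "condition_b": [0.5, 0.5],   # exactly on the hyperplane
    "condition_c": [0.1, 0.2],   # far on the negative side
    "condition_d": [0.6, 0.45],  # just off the hyperplane
}

queried = select_by_margin(POOL, WEIGHTS, BIAS, batch_size=2)
print(queried)
```

In a full AL loop, the queried items would be labeled by an expert, added to the training set, and the classifier retrained before the next selection round; labeling effort is saved because items far from the hyperplane are never sent for annotation.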

Author information

Authors and Affiliations

Department of Information Systems Engineering, Ben-Gurion University, Beer-Sheva, Israel
Nir Nissim, Yuval Elovici & Yuval Shahar
Department of Biomedical Informatics, Columbia University, New York, USA
Mary Regina Boland, Robert Moskovitch, Nicholas P. Tatonetti & George Hripcsak

Corresponding author

Correspondence to Nir Nissim.

Editor information

Editors and Affiliations

University of Pennsylvania, Philadelphia, Pennsylvania, USA
John H. Holmes
University of Pavia, Pavia, Italy
Riccardo Bellazzi
University of Pavia, Pavia, Italy
Lucia Sacchi
University of Manchester, Manchester, United Kingdom
Niels Peek

About this paper

Cite this paper

Nissim, N. et al. (2015). An Active Learning Framework for Efficient Condition Severity Classification. In: Holmes, J., Bellazzi, R., Sacchi, L., Peek, N. (eds) Artificial Intelligence in Medicine. AIME 2015. Lecture Notes in Computer Science, vol 9105. Springer, Cham. https://doi.org/10.1007/978-3-319-19551-3_3

DOI: https://doi.org/10.1007/978-3-319-19551-3_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19550-6
Online ISBN: 978-3-319-19551-3
eBook Packages: Computer Science, Computer Science (R0)
