Automatic Identification of Substance Abuse from Social History in Clinical Text

Yetisgen, Meliha; Vanderwende, Lucy

doi:10.1007/978-3-319-59758-4_18

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10259)

Included in the following conference series:

Conference on Artificial Intelligence in Medicine in Europe

Abstract

Substance abuse poses many health risks; tobacco use, for example, increases the rates of diseases such as coronary heart disease and lung cancer. Clinical notes contain rich information detailing the history of substance abuse from caregivers' perspective. In this work, we present an approach to the automatic identification of substance abuse from clinical text. We created a publicly available dataset annotated for three types of substance abuse (tobacco, alcohol, and drug), with seven entity types per event: status, type, method, amount, frequency, exposure-history, and quit-history. Using a combination of machine learning and natural language processing approaches, we obtain results on an unseen test set that range from 0.51 to 0.58 F1 for stringent, full-event identification and from 0.80 to 0.91 F1 for identification of the substance abuse event and its status. These results indicate the feasibility of extracting detailed substance abuse information from clinical records.
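
To make the two evaluation settings in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes each event is stored as a flat record whose fields mirror the seven entity types, and that scoring compares sets of events. The class name, the helper functions, and the example values such as "current" and "none" are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Hashable, Iterable, Optional


# Hypothetical record layout: the field names follow the seven entity types
# named in the abstract, but the class itself is an illustrative assumption.
@dataclass(frozen=True)
class SubstanceEvent:
    substance: str                          # "tobacco", "alcohol", or "drug"
    status: Optional[str] = None
    type: Optional[str] = None
    method: Optional[str] = None
    amount: Optional[str] = None
    frequency: Optional[str] = None
    exposure_history: Optional[str] = None
    quit_history: Optional[str] = None


def f1_score(gold: Iterable[SubstanceEvent],
             predicted: Iterable[SubstanceEvent],
             key: Callable[[SubstanceEvent], Hashable]) -> float:
    """F1 over events, where two events match if key(event) is equal.

    Simplification: events are compared as sets, so duplicates collapse.
    """
    gold_keys = {key(e) for e in gold}
    pred_keys = {key(e) for e in predicted}
    true_positives = len(gold_keys & pred_keys)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_keys)
    recall = true_positives / len(gold_keys)
    return 2 * precision * recall / (precision + recall)


def full_event(e: SubstanceEvent) -> Hashable:
    # Stringent setting: every field of the event must agree.
    return e


def event_and_status(e: SubstanceEvent) -> Hashable:
    # Relaxed setting: only the substance and its status must agree.
    return (e.substance, e.status)


# Illustrative (made-up) annotations for one note.
gold = [
    SubstanceEvent("tobacco", status="current", amount="1 pack",
                   frequency="per day"),
    SubstanceEvent("alcohol", status="none"),
]
pred = [
    SubstanceEvent("tobacco", status="current", amount="1 pack"),
    SubstanceEvent("alcohol", status="none"),
]

strict_f1 = f1_score(gold, pred, full_event)
relaxed_f1 = f1_score(gold, pred, event_and_status)
```

In this toy case the predicted tobacco event is missing its frequency, so it counts as an error under full-event matching but as a hit under event-and-status matching. That is one way the stringent scores can fall well below the relaxed ones.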

Author information

Authors and Affiliations

Biomedical and Health Informatics, School of Medicine, University of Washington, Seattle, WA, USA
Meliha Yetisgen & Lucy Vanderwende
Department of Linguistics, University of Washington, Seattle, WA, USA
Meliha Yetisgen
Microsoft Research, Redmond, WA, USA
Lucy Vanderwende

Corresponding author

Correspondence to Meliha Yetisgen.

Editor information

Editors and Affiliations

Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
Annette ten Teije
Medical University of Vienna, Vienna, Austria
Christian Popow
University of Pennsylvania, Philadelphia, Pennsylvania, USA
John H. Holmes
University of Pavia, Pavia, Italy
Lucia Sacchi

Cite this paper

Yetisgen, M., Vanderwende, L. (2017). Automatic Identification of Substance Abuse from Social History in Clinical Text. In: ten Teije, A., Popow, C., Holmes, J., Sacchi, L. (eds) Artificial Intelligence in Medicine. AIME 2017. Lecture Notes in Computer Science, vol 10259. Springer, Cham. https://doi.org/10.1007/978-3-319-59758-4_18

DOI: https://doi.org/10.1007/978-3-319-59758-4_18
Published: 30 May 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-59757-7
Online ISBN: 978-3-319-59758-4
eBook Packages: Computer Science, Computer Science (R0)
