short-paper

Melanoma Risk Prediction with Structured Electronic Health Records

Authors:
Aaron N. Richter, Florida Atlantic University, Boca Raton, FL, USA
Taghi M. Khoshgoftaar, Florida Atlantic University, Boca Raton, FL, USA

BCB '18: Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics, August 2018, Pages 194–199. https://doi.org/10.1145/3233547.3233561

Published: 15 August 2018


ABSTRACT

Melanoma is one of the fastest-growing cancers in the world and can affect patients earlier in life than most other cancers. It is therefore imperative to identify patients at high risk for melanoma and enroll them in screening programs so that the cancer is detected early. In this study, we explore data from dermatology outpatients to build a risk model for the disease. Using millions of patient records with thousands of data points in each record, we show that a melanoma risk model can be built from real-world Electronic Health Record (EHR) data without any expert knowledge or manually engineered features. While other risk models for melanoma have been developed, this is the first to use routinely collected EHR data rather than expert features targeted specifically at melanoma. The random forest model achieves performance similar to or better than that of these previous models (AUC 0.79, sensitivity 0.71, specificity 0.72), which allows larger populations of patients to be screened for melanoma risk without specialized and time-consuming data collection. Important features can be extracted from the model and studied, and the features influencing a specific prediction can be explained to providers and patients. The process for building this model can be refined further to improve performance, and it can also be applied to risk prediction for other diseases.
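As an illustration of the three metrics quoted in the abstract, the following minimal sketch computes AUC, sensitivity, and specificity from a list of predicted risk scores. It is not the authors' evaluation code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for this example, and the resulting values are unrelated to the paper's results.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when scores >= threshold are flagged high risk."""
    tp = fn = tn = fp = 0
    for label, score in zip(y_true, scores):
        flagged = score >= threshold
        if label == 1:
            tp += int(flagged)
            fn += int(not flagged)
        else:
            fp += int(flagged)
            tn += int(not flagged)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = melanoma case, 0 = control (not from the paper).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1]

# The 0.5 threshold is an assumption; in practice it is tuned to trade
# sensitivity against specificity.
sens, spec = sensitivity_specificity(y_true, scores, threshold=0.5)
model_auc = auc(y_true, scores)
print(f"AUC={model_auc:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

AUC does not depend on a threshold, whereas sensitivity and specificity do; a pair of values such as those reported in the abstract therefore corresponds to one chosen operating point on the ROC curve.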


Index Terms

Melanoma Risk Prediction with Structured Electronic Health Records
1. Applied computing
  1. Life and medical sciences
    1. Health informatics
2. Computing methodologies
  1. Machine learning
    1. Machine learning approaches


Published in
BCB '18: Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics
August 2018
727 pages
ISBN: 9781450357944
DOI: 10.1145/3233547
General Chairs: Amarda Shehu (George Mason University, USA), Cathy Wu (University of Delaware, USA)
Program Chairs: Christina Boucher (University of Florida, USA), Jing Li (Case Western Reserve University, USA), Hongfang Liu (Mayo Clinic, USA), Mihai Pop (University of Maryland, USA)
Copyright © 2018 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
big data
clinical decision support
machine learning
melanoma
skin cancer
Qualifiers
- short-paper
Acceptance Rates
BCB '18 paper acceptance rate: 46 of 148 submissions, 31%. Overall acceptance rate: 254 of 885 submissions, 29%.