A Workflow for Creating Multimodal Machine Learning Models for Metastasis Predictions in Melanoma Patients

Rugolon, Franco; Randl, Korbinian; Bampa, Maria; Papapetrou, Panagiotis

doi:10.1007/978-3-031-74640-6_7

Part of the book series: Communications in Computer and Information Science (CCIS, volume 2136)

Included in the following conference series:

Joint European Conference on Machine Learning and Knowledge Discovery in Databases

Abstract

Melanoma is the most common form of skin cancer, responsible for thousands of deaths annually. Novel therapies have been developed, but metastases are still a common problem, increasing the mortality rate and decreasing the quality of life of those who experience them. Traditional machine learning (ML) models for metastasis prediction have been limited to a single modality. In this study, we explore and compare unimodal and multimodal ML models for predicting the onset of metastasis in melanoma patients. The goal is to help clinicians focus their attention on patients at higher risk of developing metastasis, increasing the likelihood of an earlier diagnosis. We use a patient cohort derived from an Electronic Health Record and consider several data modalities: static data, time series, and clinical text. We formulate the problem and propose a multimodal ML workflow for predicting the onset of metastasis in melanoma patients. We evaluate the workflow using various classification metrics and statistical significance testing. The experimental findings suggest that multimodal models outperform unimodal ones, demonstrating the potential of multimodal ML to predict the onset of metastasis.
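
As a rough illustration of what combining static, time series, and text modalities can look like, the sketch below concatenates one feature vector per modality into a single vector per patient (often called early or feature-level fusion). This is not the authors' workflow, which is not described on this page: the function names, the summary statistics, the bag-of-words encoding, and the patient values are all assumptions made up for illustration.

```python
from statistics import mean, pstdev


def summarize_series(values):
    """Collapse a variable-length time series into four summary statistics."""
    if not values:
        # Assumption: patients without measurements get an all-zero summary.
        return [0.0, 0.0, 0.0, 0.0]
    return [mean(values), pstdev(values), min(values), max(values)]


def bag_of_words(text, vocabulary):
    """Count how often each vocabulary term occurs in a lower-cased note."""
    tokens = text.lower().split()
    return [float(tokens.count(term)) for term in vocabulary]


def fuse_features(static, series, text, vocabulary):
    """Early fusion: concatenate the per-modality vectors into one vector."""
    return list(static) + summarize_series(series) + bag_of_words(text, vocabulary)


# Hypothetical patient record; every value here is invented.
vocabulary = ["lesion", "biopsy"]
patient = fuse_features(
    static=[63.0, 1.0],        # e.g. age and an encoded categorical field
    series=[1.0, 2.0, 3.0],    # e.g. repeated measurements over time
    text="biopsy of lesion and lesion excised",
    vocabulary=vocabulary,
)
print(len(patient), patient)
```

The fused vector could then be passed to any standard classifier. Other designs, such as training one model per modality and combining their predictions, are equally plausible; the repository listed in the Notes is the place to check what the paper actually does.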

Notes

1. https://github.com/FoxtrotRomeo/melanoma_metastasis
2. This research has been approved by the Regional Ethical Review Board in Stockholm under permission no. 2014/1882-31/5.

Acknowledgements

This work was supported in part by the Digital Futures EXTREMUM project on “Explainable and Ethical Machine Learning for Knowledge Discovery from Medical Data Sources”.

This work has received funding from the Horizon Europe Research and Innovation programme under Grant Agreements No 875351 and 101093026.

Author information

Authors and Affiliations

Department of Computer and Systems Sciences, Stockholm University, Stockholm, Sweden
Franco Rugolon, Korbinian Randl, Maria Bampa & Panagiotis Papapetrou

Corresponding author

Correspondence to Franco Rugolon.

Editor information

Editors and Affiliations

University of Turin, Turin, Italy
Rosa Meo
Sapienza University of Rome, Rome, Italy
Fabrizio Silvestri

Cite this paper

Rugolon, F., Randl, K., Bampa, M., Papapetrou, P. (2025). A Workflow for Creating Multimodal Machine Learning Models for Metastasis Predictions in Melanoma Patients. In: Meo, R., Silvestri, F. (eds) Machine Learning and Principles and Practice of Knowledge Discovery in Databases. ECML PKDD 2023. Communications in Computer and Information Science, vol 2136. Springer, Cham. https://doi.org/10.1007/978-3-031-74640-6_7

DOI: https://doi.org/10.1007/978-3-031-74640-6_7
Published: 01 January 2025
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-74639-0
Online ISBN: 978-3-031-74640-6
eBook Packages: Artificial Intelligence (R0)
