Abstract
We present a study on predicting long-COVID sequelae through multi-label classification (MLC). Data on more than 300 patients were collected during a long-COVID study at Ospedale Maggiore of Novara (Italy), covering both their baseline situation and their condition at the onset of acute COVID-19. The goal is to predict the presence of specific long-COVID sequelae after a one-year follow-up. To make the analysis more representative, we carefully investigated the possibility of augmenting the dataset so as to cover situations with different numbers of complications. We considered MLSMOTE under six different data augmentation policies, and tested a representative set of MLC approaches on all the resulting datasets. Results were evaluated in terms of Accuracy, Exact match, Hamming Score and macro-averaged AUC; they show that MLC methods can be useful for predicting specific long-COVID sequelae under the different conditions represented by the datasets considered.
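The abstract does not spell out how the four evaluation measures are computed. The following is a minimal sketch using their usual multi-label definitions, under the assumption that Accuracy is the per-example Jaccard index and Hamming Score is one minus the Hamming loss; the paper's own definitions may differ. The toy label matrix, the scores, the 0.5 threshold and all function names are hypothetical and are not taken from the study.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def exact_match(y_true: Matrix, y_pred: Matrix) -> float:
    """Fraction of instances whose whole label vector is predicted correctly."""
    hits = sum(list(t) == list(p) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


def hamming_score(y_true: Matrix, y_pred: Matrix) -> float:
    """Fraction of individual (instance, label) decisions that are correct."""
    total = sum(len(t) for t in y_true)
    correct = sum(a == b for t, p in zip(y_true, y_pred) for a, b in zip(t, p))
    return correct / total


def jaccard_accuracy(y_true: Matrix, y_pred: Matrix) -> float:
    """Mean over instances of |true AND pred| / |true OR pred| (assumed definition)."""
    scores = []
    for t, p in zip(y_true, y_pred):
        inter = sum(1 for a, b in zip(t, p) if a == 1 and b == 1)
        union = sum(1 for a, b in zip(t, p) if a == 1 or b == 1)
        # Convention chosen here: an instance with no true and no predicted label scores 1.
        scores.append(1.0 if union == 0 else inter / union)
    return sum(scores) / len(scores)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC for one label as the probability that a positive outranks a negative."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC is undefined when a label has only one class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_auc(y_true: Matrix, y_score: Matrix) -> float:
    """Unweighted mean of the per-label AUC values."""
    n_labels = len(y_true[0])
    per_label = [
        auc([int(row[j]) for row in y_true], [row[j] for row in y_score])
        for j in range(n_labels)
    ]
    return sum(per_label) / n_labels


# Hypothetical toy data: 4 patients, 3 sequelae (1 = present at follow-up).
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_score = [[0.9, 0.2, 0.6], [0.3, 0.8, 0.4], [0.7, 0.15, 0.1], [0.2, 0.1, 0.7]]
# Illustrative threshold of 0.5 to turn scores into label decisions.
y_pred = [[1 if s >= 0.5 else 0 for s in row] for row in y_score]

print("Accuracy (Jaccard):", jaccard_accuracy(y_true, y_pred))
print("Exact match:", exact_match(y_true, y_pred))
print("Hamming score:", hamming_score(y_true, y_pred))
print("Macro-averaged AUC:", macro_auc(y_true, y_score))
```

Exact match is the strictest of the four, since a single wrong label invalidates the whole instance, while Hamming Score credits each label decision separately. Macro-averaging gives every sequela the same weight regardless of how frequent it is, which matters when some labels are rare.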
Notes
1. The collected data actually concerned many more hospitalized patients, but we were able to work only with those patients who decided to participate in the study and for whom reliable data were available [5].
References
TECNOMED-HUB webpage. https://www.tecnomedhub.it. Accessed 30 June 2023
Atkinson, A.: On the measurement of inequality. J. Econ. Theory 2(3), 244–263 (1970)
Baarts, J., et al.: Multilabel classification of disease prediction in patients presenting with dyspnea. Eur. Respir. J. 58(suppl 65) (2021)
Bellan, M., et al.: Long-term sequelae are highly prevalent one year after hospitalization for severe covid-19. Sci. Rep. 11(1), 22666 (2021)
Bellan, M., Soddu, D., Balbo, P.E., Baricich, A., Zeppegno, P., et al.: Respiratory and psychophysical sequelae among patients with covid-19 four months after hospital discharge. JAMA Netw. Open 4(1), e2036142 (2021)
Bogatinovski, J., Todorovski, L., Džeroski, S., Kocev, D.: Comprehensive comparative study of multi-label classification methods. Expert Syst. Appl. 203, 117215 (2022)
Charte, F., Rivera, A., delJesus, M., Herrera, F.: MLSMOTE: approaching imbalanced multilabeled learning through synthetic instance generation. Knowl. Based Syst. 89, 385–397 (2015)
Charte, F., Rivera, A., delJesus, M., Herrera, F.: Dealing with difficult minority labels in imbalanced mutilabel data sets. Neurocomputing 326–327, 39–53 (2019)
Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: Smote: synthetic minority over-sampling technique. J. Artif. Intell. Res. 16(1), 321–357 (2002)
Frank, E., Hall, M., Witten, I.: The WEKA workbench. In: Data Mining: Practical Machine Learning Tools and Techniques, 4th ed. (2016). (Online Appendix)
Gibaja, E., Ventura, S.: A tutorial on multilabel learning. ACM Comput. Surv. (CSUR) 47(3), 1–38 (2015)
Guo, Y., Gu, S.: Multi-label classification using conditional dependency networks. In: Proceedings of the 22nd International Joint Conference on Artificial Intelligence (IJCAI 2011), pp. 1300–1305 (2011)
Huang, Y., et al.: A multi-label learning prediction model for heart failure in patients with atrial fibrillation based on expert knowledge of disease duration. Appl. Intell., 1–12 (2023)
Madjarov, G., Kocev, D., Gjorgjevikj, D., Džeroski, S.: An extensive experimental comparison of methods for multi-label learning. Pattern Recogn. 45(9), 3084–3104 (2012)
Nalbandian, A., et al.: Post-acute covid-19 syndrome. Nat. Med. 27(4), 601–615 (2021)
Panigutti, C., Guidotti, R., Monreale, A., Pedreschi, D.: Explaining multi-label black-box classifiers for health applications. In: Shaban-Nejad, A., Michalowski, M. (eds.) W3PHAI 2019. SCI, vol. 843, pp. 97–110. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-24409-5_9
Rana, P., Sowmya, A., Meijering, E., Song, Y.: Imbalanced classification for protein subcellular localization with multilabel oversampling. Bioinformatics 39(1), btac841 (2023)
Read, J., Pfahringer, B., Holmes, G., Frank, E.: Classifier chains for multi-label classification. Mach. Learn. 85, 333–359 (2011)
Read, J., Pfahringer, B., Holmes, G.: Multi-label classification using ensembles of pruned sets. In: Proceedings of the 8th IEEE International Conference on Data Mining (ICDM 2008), pp. 995–1000 (2008)
Read, J., Reutemann, P., Pfahringer, B., Holmes, G.: MEKA: a multi-label/multi-target extension to Weka. J. Mach. Learn. Res. 17(21), 1–5 (2016). http://meka.sourceforge.net/
Robert, C.P., Casella, G.: Monte Carlo Statistical Methods, 2nd edn. Springer, Cham (2004). https://doi.org/10.1007/978-1-4757-4145-2
Tabia, K.: Towards explainable multi-label classification. In: 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI), pp. 1088–1095 (2019). https://doi.org/10.1109/ICTAI.2019.00152
Tarekegn, A., Giacobini, M., Michalak, K.: A review of methods for imbalanced multi-label classification. Pattern Recogn. 118, 107965 (2021)
Tsoumakas, G., Katakis, I., Vlahavas, I.: Random k-labelsets for multi-label classification. IEEE Trans. Knowl. Data Eng. 23, 1079–1089 (2011)
Zaragoza, J., Sucar, L., Morales, E., Bielza, C., Larranaga, P.: Bayesian chain classifiers for multidimensional classification. In: Proceedings of the 22nd International Joint Conference on Artificial Intelligence (IJCAI 2011), pp. 2192–2197 (2011)
Zhou, L., Zheng, X., Yang, D., Wang, Y., Bai, X., Ye, X.: Application of multi-label classification models for the diagnosis of diabetic complications. BMC Med. Inform. Decis. Making 21(1), 182 (2021)
Acknowledgments
M. Dossena and C. Irwin are supported by the National PhD program in Artificial Intelligence for Healthcare and Life Sciences (Campus Bio-medico University of Rome). We want to thank A. Chiocchetti and M. Bellan for having provided us with the long-COVID data and for several fruitful discussions about the case study. This work was funded by “Piano Riparti Piemonte”, Azione n. 173 “INFRA-P. Realizzazione, rafforzamento e ampliamento infrastrutture di ricerca pubbliche—bando INFRA-P2-TECNOMED-HUB n.378-48”.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Dossena, M., Irwin, C., Piovesan, L., Portinale, L. (2023). A Multi-label Classification Study for the Prediction of Long-COVID Syndrome. In: Basili, R., Lembo, D., Limongelli, C., Orlandini, A. (eds) AIxIA 2023 – Advances in Artificial Intelligence. AIxIA 2023. Lecture Notes in Computer Science, vol 14318. Springer, Cham. https://doi.org/10.1007/978-3-031-47546-7_18
DOI: https://doi.org/10.1007/978-3-031-47546-7_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-47545-0
Online ISBN: 978-3-031-47546-7
eBook Packages: Computer Science (R0)