Abstract
In infant mortality risk analyses, Machine Learning techniques such as Feature Selection can make the data more interpretable and help explain the studied phenomenon. In this paper, we develop a Machine Learning approach to identify the main risk factors for infant mortality in a local population, aiming to help professionals who deal directly with the event or with the epidemiological guidelines that may be derived from data analysis. First, we integrated the databases of the Live Birth Information System (SINASC) and the Infant Mortality Information System (SIM) for the city of Vitória, ES, Brazil, covering 2006 to 2019. Then, we used feature selection methods, such as SHAP, Feature_Importance and SelectKBest, to identify the main risk factors associated with infant mortality, and we compared the results of these algorithms with those of a recent (2018) meta-analysis. The results obtained by the methods, especially by SHAP, match those of the meta-analysis: the factors that most influenced the mortality outcome were Weight, APGAR, Gestational Age and Presence of Anomalies. Therefore, interpretability techniques such as SHAP are very promising for selecting and identifying population risk factors related to infant mortality from existing databases, without the need for new population studies. This knowledge can also support decision making in public health.
Supported by FAPES (T.O. 179/2019), IFES, EMESCAM and PMV – ES, Brazil.
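As an illustration of the univariate ranking step named in the abstract, the following is a minimal sketch and not the authors' pipeline. scikit-learn's SelectKBest uses f_classif (a one-way ANOVA F-test) as its default scoring function; the sketch computes that statistic by hand with the Python standard library and keeps the k highest-scoring features. The data are synthetic, and the feature names (`birth_weight_g`, `apgar_5min`, `noise`), effect sizes and sample size are assumptions chosen only for the example. SHAP values and tree-based feature importances are not shown.

```python
import random
from statistics import mean


def anova_f(values, labels):
    """One-way ANOVA F-statistic of a single feature across outcome classes."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand_mean = mean(values)
    group_means = {y: mean(g) for y, g in groups.items()}
    # Variation explained by class membership.
    ss_between = sum(
        len(g) * (group_means[y] - grand_mean) ** 2 for y, g in groups.items()
    )
    # Residual variation inside each class.
    ss_within = sum(
        (v - group_means[y]) ** 2 for y, g in groups.items() for v in g
    )
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_k_best(rows, labels, names, k):
    """Return the names of the k features with the highest F-statistic."""
    scores = {
        name: anova_f([row[i] for row in rows], labels)
        for i, name in enumerate(names)
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Synthetic records (hypothetical, NOT SINASC/SIM data): the outcome shifts
# birth weight and Apgar score, while the third feature is pure noise.
rng = random.Random(0)
names = ["birth_weight_g", "apgar_5min", "noise"]
rows, labels = [], []
for _ in range(500):
    died = 1 if rng.random() < 0.2 else 0
    rows.append((
        rng.gauss(2200 if died else 3200, 500),  # birth weight in grams
        rng.gauss(6 if died else 9, 1.5),        # 5-minute Apgar score
        rng.gauss(0, 1),                         # unrelated feature
    ))
    labels.append(died)

top = select_k_best(rows, labels, names, k=2)
print(top)
```

A univariate score like this evaluates each feature in isolation, so it cannot capture interactions between features; that is one reason to compare it with model-based explanations such as SHAP, as the abstract describes.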
References
Kassar, S.B., Melo, A.M., Coutinho, S.B., Lima, M.C., Lira, P.I.: Determinants of neonatal death with emphasis on health care during pregnancy, childbirth and reproductive history. J. Pediatr. (Rio J) 89(3), 269–277 (2013). https://doi.org/10.1016/j.jped.2012.11.005
Borgesa, T.S., Vayego, S.A.: Risk factors for neonatal mortality in a county in Southern region. Ciência Saúde (Paraná) 8(1), 7–14 (2015). https://doi.org/10.15448/1983-652X.2015.1.21010
Garcia, L.P., Fernandes, C.M., Traebert, J.: Risk factors for neonatal death in the capital city with the lowest infant mortality rate in Brazil. J. Pediatr. (Rio J) 95(2), 194–200 (2019). https://doi.org/10.1016/j.jped.2017.12.007
Gaiva, M.A.M., Fujimori, E., Sato, A.P.S.: Maternal and child risk factors associated with neonatal mortality. Texto Contexto Enferm 25(4), e2290015 (2016). https://doi.org/10.1590/0104-07072016002290015
World health statistics 2020: monitoring health for the SDGs, sustainable development goals. Geneva: World Health Organization (2020). Licence: CC BY-NC-SA 3.0 IGO
Welcome to the SHAP documentation. SHAP latest documentation. https://shap.readthedocs.io/en/latest/
Lundberg, S.M., Lee, S.-I.: A unified approach to interpreting model predictions. In: Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS 2017), pp. 4768–4777. Curran Associates Inc., Red Hook (2017)
XGBoost Documentation. xgboost 1.5.0-dev documentation. https://xgboost.readthedocs.io/en/latest/
Veloso, F.C.S., Kassar, L.M.L., Oliveira, M.J.C., et al.: Analysis of neonatal mortality risk factors in Brazil: a systematic review and meta-analysis of observational studies. J. Pediatr. (Rio J) 95(5), 519–530 (2019). https://doi.org/10.1016/j.jped.2018.12.014
Pedregosa, F., et al.: Scikit-learn: machine learning in Python. JMLR 12, 2825–2830 (2011)
Batista, A.F.M., Diniz, C.S.G., Bonilha, E.A., Kawachi, I., Chiavegatto Filho, A.D.P.: Neonatal mortality prediction with routinely collected data: a machine learning approach. BMC Pediatr. 21(1), 322 (2021). https://doi.org/10.1186/s12887-021-02788-9
Panch, T., Mattie, H., Celi, L.A.: The “inconvenient truth” about AI in healthcare. NPJ Digit. Med. 2, 77 (2019). https://doi.org/10.1038/s41746-019-0155-4
Hamet, P., Tremblay, J.: Artificial intelligence in medicine. Metabolism 69S, S36–S40 (2017). https://doi.org/10.1016/j.metabol.2017.01.011
Hernandez, A.V., Marti, K.M., Roman, Y.M.: Meta-analysis. Chest 158(1S), S97–S102 (2020). https://doi.org/10.1016/j.chest.2020.03.003
Fernandes, F.T., de Oliveira, T.A., Teixeira, C.E., et al.: A multipurpose machine learning approach to predict COVID-19 negative prognosis in São Paulo, Brazil. Sci. Rep. 11, 3343 (2021). https://doi.org/10.1038/s41598-021-82885-y
Alaa, A.M., Bolton, T., Di Angelantonio, E., Rudd, J.H.F., van der Schaar, M.: Cardiovascular disease risk prediction using automated machine learning: a prospective study of 423,604 UK Biobank participants. PLoS One 14(5), e0213653 (2019). https://doi.org/10.1371/journal.pone.0213653
Funding
The authors thank FAPES (Fundação de Amparo à Pesquisa do Espírito Santo) for its sponsorship and PMV-ES (Prefeitura Municipal de Vitória do Espírito Santo) for granting access to its data.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Colodette, A.L., Filho, F.N.B., Pinasco, G.C., de Souza Cruz, S.C., Simões, S.N. (2022). Feature Selection for Identification of Risk Factors Associated with Infant Mortality. In: Bansal, M.S., et al. Computational Advances in Bio and Medical Sciences. ICCABS 2021. Lecture Notes in Computer Science, vol 13254. Springer, Cham. https://doi.org/10.1007/978-3-031-17531-2_8
DOI: https://doi.org/10.1007/978-3-031-17531-2_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-17530-5
Online ISBN: 978-3-031-17531-2
eBook Packages: Computer Science, Computer Science (R0)