Abstract
The Infant Mortality Rate (IMR) is defined as the number of infants per thousand who do not survive until their first birthday. IMR is an important metric not only because it provides information about infant births in an area, but also because it measures the general health status of a society. In the United States of America, the IMR is higher than in many other developed countries, despite the country's high level of prosperity. The U.S.A. also exhibits strong and persistent inequalities in the IMR across different racial and ethnic groups (Kochanek et al. in Natl Vital Stat Rep 65(4):1–122, 2006). In this paper, we study predictive models for the problem of infant mortality. We implement traditional machine learning models and state-of-the-art neural network models with various combinations of features extracted from birth certificates. Those combinations include features that can be summarized as socio-economic and ethical features related to the mother and the father of the infant, and medical measurements taken during the pregnancy and the delivery. We approach the classification problem of infant mortality, that is, whether an infant will survive until her first birthday or not, both as a binary problem and as a multi-class problem based on the time of death. We focus on understanding and exploring the importance of features extracted from the birth certificates. For example, we compare the performance of models trained on the general population with that of models trained on subsets of the population, e.g., on individual races. Our experimental evaluation compares different predictive models (including those used by epidemiology researchers), various combinations of features, and different distributions in the training set, and examines the importance of the features.
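As a rough illustration of the definition and of the two problem formulations mentioned above, the following minimal Python sketch computes an IMR and derives binary and multi-class labels from an age at death. It is not the authors' pipeline. The function names, the use of live births as the denominator, and the class cut-offs (under 28 days as "neonatal", 28 to 364 days as "postneonatal") are assumptions made here for illustration; the abstract does not state which time-of-death classes the paper uses.

```python
from typing import Optional


def infant_mortality_rate(infant_deaths: int, live_births: int) -> float:
    """Deaths before the first birthday per 1,000 births.

    Assumption: the denominator is the number of live births.
    """
    if live_births <= 0:
        raise ValueError("live_births must be positive")
    return 1000.0 * infant_deaths / live_births


def binary_label(age_at_death_days: Optional[int]) -> int:
    """Return 1 if the infant died before the first birthday, else 0.

    None means no death was recorded for the infant.
    """
    return int(age_at_death_days is not None and age_at_death_days < 365)


def multiclass_label(age_at_death_days: Optional[int]) -> str:
    """Label an infant by time of death.

    The cut-offs below are hypothetical and chosen only for illustration.
    """
    if age_at_death_days is None or age_at_death_days >= 365:
        return "survived"
    if age_at_death_days < 28:
        return "neonatal"
    return "postneonatal"
```

For example, under these assumptions `infant_mortality_rate(6, 1000)` gives 6.0 deaths per thousand births, and an infant who died at 10 days of age receives the binary label 1 and the multi-class label "neonatal".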
References
Abrevaya J (2002) The effects of demographics and maternal behavior on the distribution of birth outcomes. In: Economic applications of quantile regression
Acevedo-Garcia D, Soobader MJ, Berkman LF (2007) Low birthweight among U.S. Hispanic/Latino subgroups: the effect of maternal foreign-born status and education. Soc Sci Med 65(12):2503–2516
Acevedo-Garcia D, Soobader MJ, Berkman LF (2005) The differential effect of foreign-born status on low birth weight by race/ethnicity and education. Pediatrics 115(1):e20–e30
Almond D, Chay KY, Lee DS (2005) The costs of low birth weight. Q J Econ 120:1031–1083
Callaghan WM, MacDorman MF, Rasmussen SA, Qin C, Lackritz EM (2006) The contribution of preterm birth to infant mortality rates in the United States. Pediatrics 118(4):1566–1573
Casey BM, McIntire DD, Leveno KJ (2001) The continuing value of the Apgar score for the assessment of newborn infants. New Engl J Med 344:467–471
Chen T, Guestrin C (2016) XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd SIGKDD 2016. ACM
Doyle JM, Echevarria S, Frisbie WP (2003) Race/ethnicity, Apgar and infant mortality. Springer, Berlin
Finch BK (2003) Early origins of the gradient: the relationship between socioeconomic status and infant mortality in the United States. Demography 40(4):675–699
Health (2006) United States, 2005: with chartbook on trends in the health of Americans. US Department of Health and Human Services, Washington
Hegyi T, Carbone T, Anwar M, Ostfeld B, Hiatt M, Koons A, Pinto-Martin J, Paneth N (1998) The Apgar score and its components in the preterm infant. Pediatrics 101(1 Pt 1):77–81
Hessol NA, Fuentes-Afflick E (2005) Ethnic differences in neonatal and postneonatal mortality. Pediatrics 115(1):e44–e51
Hessol NA, Fuentes-Afflick E, Bacchetti P (1998) Risk of low birth weight infants among black and white parents. Elsevier, Amsterdam
Hummer RA, Biegler M, De Turk PB, Forbes D, Frisbie WP, Hong Y, Pullum SG (1999) Race/ethnicity, nativity, and infant mortality in the United States. Soc Forc 77:1083–1118
John GH, Langley P (1995) Estimating continuous distributions in Bayesian classifiers. In: Proceedings of the eleventh conference on uncertainty in artificial intelligence, UAI’95
Kochanek KD, Murphy SL, Xu J, Tejada-Vera B (2006) Deaths: final data for 2014. Natl Vital Stat Rep 65(4):1–122
Ma S, Finch BK (2010) Birth outcome measures and infant mortality. Popul Res Policy Rev 29:865
Macinko J, Guanais FC, de Souza M (2006) Evaluation of the impact of the family health program on infant mortality in Brazil, 1990–2002. J Epidemiol Commun Health 60(1):13–19
Mathews T, MacDorman MF (2007) Infant mortality statistics from the 2004 period linked birth/infant death data set. Natl Vital Stat Rep 55(14):1–32
McCormick MC (1985) The contribution of low birth weight to infant mortality and childhood morbidity. N Engl J Med 312:82–90
Osypuk TL, Acevedo-Garcia D (2008) Are racial disparities in preterm birth larger in hypersegregated areas? Am J Epidemiol 167(11):1295–1304
Papile LA (2001) The Apgar score in the 21st century. N Engl J Med 344(7):519–520
Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825–2830
Potash E, Brew J, Loewi A, Majumdar S, Reece A, Walsh J, Rozier E, Jorgenson E, Mansour R, Ghani R (2015) Predictive modeling for public health: Preventing childhood lead poisoning. In: Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining, KDD’15. ACM
Powers D, Parker F (2006) Race/ethnic differences and age-variation in the effects of birth outcomes on infant mortality in the US. Demograph Res 14(10):179–216
Rinta-Koski OP (2018) Machine learning in neonatal intensive care. Ph.D. Thesis, Aalto University, Helsinki. http://urn.fi/URN:ISBN:978-952-60-8210-3
Rinta-Koski OP, Särkkä S, Hollmén J, Leskinen M, Andersson S (2018) Gaussian process classification for prediction of in-hospital mortality among preterm infants. Neurocomputing 298:134–141
Saravanou A, Noelke C, Huntington N, Acevedo-Garcia D, Gunopulos D (2019) Infant mortality prediction using birth certificate data. DSHealth KDD workshop. arXiv preprint arXiv:1907.08968
Saravanou A, Noelke C, Huntington N, Acevedo-Garcia D, Gunopulos D (2019b) Predicting infant mortality at the time of birth. Population Association Annual Meeting, Austin
Schölkopf B, Williamson RC, Smola AJ, Shawe-Taylor J, Platt JC (2000) Support vector method for novelty detection. In: Advances in neural information processing systems, pp 582–588
Somanchi S, Adhikari S, Lin A, Eneva E, Ghani R (2015) Early prediction of cardiac arrest (code blue) using electronic medical records. In: Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining, KDD’15. ACM
Wilcox AJ (2001) On the importance-and the unimportance-of birthweight. Int J Epidemiol 30:1233–1241
Wilcox AJ, Skjaerven R (1992) Birth weight and perinatal mortality: the effect of gestational age. Am J Public Health 82:378–382
Acknowledgements
The authors would like to thank the anonymous reviewers for providing insightful feedback. This research has been financed by a Google Faculty Research Award, the EU Horizon 2020 research and innovation programme under grant agreement No. 734242 (Project LAMBDA), ESPA Grant No. 16521, Robert Wood Johnson Foundation Grant 71192, and W.K. Kellogg Foundation Grant P3036220.
Additional information
Responsible editors: Myra Spiliopoulou and Panagiotis Papapetrou.
Cite this article
Saravanou, A., Noelke, C., Huntington, N. et al. Predictive modeling of infant mortality. Data Min Knowl Disc 35, 1785–1807 (2021). https://doi.org/10.1007/s10618-020-00728-2
DOI: https://doi.org/10.1007/s10618-020-00728-2