Abstract
A newborn whose birth weight is above the 90th percentile for its gestational age is termed large for gestational age (LGA). LGA infants are at risk of serious complications during and after the antepartum period because the condition is often not identified early. Earlier recognition of LGA infants could slow progression and prevent further complications. In medical science, the prevention and mitigation of disease require examination of biochemical indicators. Machine learning has evolved into a tool that can predict LGA infants from their most deterministic characteristics. This study aims to identify the most deterministic biochemical indicators for LGA prediction with minimal computational overhead. To the best of our knowledge, this is the first study to identify the most deterministic risk factors associated with LGA and to develop an LGA prediction model using machine learning techniques. To develop an efficient prediction model, we conducted three groups of experiments that considered basic machine learning methods, feature selection, and imbalanced data, respectively. Support vector machine, logistic regression, Naive Bayes, and random forest classifiers were trained using tenfold cross-validation on an LGA dataset. We selected precision and area under the curve as performance evaluation metrics; adopted information gain, an entropy-based feature selection method, to rank features; and introduced an ensemble data imbalance technique in the last group of experiments. In each group of experiments, the support vector machine outperformed the other classifiers, producing the highest prediction precision of 85%.
All of the classifiers performed best with the subset of thirty top-ranked features, which validates the applied method for recognizing the most deterministic risk factors associated with LGA prediction.
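The evaluation protocol summarized in the abstract (entropy-based feature ranking, a thirty-feature subset, four classifiers, tenfold cross-validation, precision and area under the curve) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes scikit-learn, uses a synthetic imbalanced dataset in place of the LGA data, and uses `mutual_info_classif` as a stand-in estimate of information gain. The function name `evaluate_classifiers`, the hyperparameters, and the dataset sizes are hypothetical, and the ensemble data imbalance technique is not reproduced.

```python
from functools import partial

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def evaluate_classifiers(X, y, n_features=30, n_splits=10, seed=0):
    """Cross-validate four classifiers on the top-ranked feature subset."""
    # Assumption: mutual information stands in for information gain.
    score_func = partial(mutual_info_classif, random_state=seed)
    classifiers = {
        "svm": SVC(random_state=seed),
        "logistic_regression": LogisticRegression(max_iter=1000),
        "naive_bayes": GaussianNB(),
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=seed),
    }
    # Stratified folds keep the class ratio similar in every split.
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    results = {}
    for name, clf in classifiers.items():
        # Scaling and feature selection sit inside the pipeline so they are
        # refit on each training fold and never see the held-out fold.
        model = make_pipeline(
            StandardScaler(),
            SelectKBest(score_func, k=n_features),
            clf,
        )
        scores = cross_validate(model, X, y, cv=cv, scoring=("precision", "roc_auc"))
        results[name] = {
            "precision": scores["test_precision"].mean(),
            "auc": scores["test_roc_auc"].mean(),
        }
    return results


# Synthetic, imbalanced stand-in data (about 15% positives); not the LGA dataset.
X, y = make_classification(
    n_samples=400, n_features=40, n_informative=10,
    weights=[0.85, 0.15], random_state=0,
)
results = evaluate_classifiers(X, y)
for name, metrics in results.items():
    print(f"{name}: precision={metrics['precision']:.3f}, auc={metrics['auc']:.3f}")
```

Placing the selector inside the pipeline is a design choice of this sketch: ranking features on the full dataset before cross-validation would leak information from the held-out folds into the ranking.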
Acknowledgements
This work is supported by the National Key Research and Development Program of China under project No. 2017YFB1400803.
Cite this article
Akhtar, F., Li, J., Azeem, M. et al. Effective large for gestational age prediction using machine learning techniques with monitoring biochemical indicators. J Supercomput 76, 6219–6237 (2020). https://doi.org/10.1007/s11227-018-02738-w
DOI: https://doi.org/10.1007/s11227-018-02738-w