
Dynamic weighted ensemble for diarrhoea incidence predictions

Machine Learning

Abstract

Diarrhoea (DH) poses a significant threat to national morbidity and mortality in Vietnam, especially among children. As a climate-sensitive disease, it is strongly linked to meteorological factors such as rainfall and temperature. Consequently, with global climate change, the risk of diarrhoea has been gradually increasing, while Vietnam is already a global diarrhoea hotspot. An effective early warning system is therefore urgently needed. However, the problem has received little attention: the few existing studies mainly focus on quantifying the relationships between climate factors and diarrhoea incidence. Exploring more sophisticated machine learning techniques is therefore a promising step towards more efficient and effective warning systems. This paper makes two main contributions. First, we study a wide range of state-of-the-art prediction models, from traditional to recent advanced methods (e.g., SARIMA, SARIMAX, LSTM, CNN, XGBoost, SVM, LightGBM, CatBoost, N-HiTS, BlockRNN, TCN, TFT, and Transformer), for predicting DH rates across a large number of locations (55 provinces) with different climatic, geographic and socio-economic conditions. This provides an overview of how different ML models perform on the prediction task, which can help other researchers develop early warning systems for DH elsewhere. Second, we introduce a novel ensemble prediction model, called dynamic weighted ensemble (DWE), to further improve DH prediction performance. DWE is a two-layer ensemble approach. The first layer generates different meta models based on four base component models. The second layer employs a novel approach to predict the performance of all selected meta models and uses these predictions to dynamically combine the models in a weighted scheme to produce the final results. This differs from traditional ensemble approaches, which rely only on fixed combinations of their components. To the best of our knowledge, DWE is also the first ensemble approach for diarrhoea prediction. Extensive experiments are conducted across all 55 provinces to demonstrate the performance of DWE and to reveal its important characteristics.
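To make the two-layer idea in the abstract more concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not specify how meta models are built, selected, or scored, so every detail here is an assumption: meta models are taken to be simple averages over subsets of the base forecasts, the "predicted performance" is approximated by each meta model's mean absolute error over a recent window (the paper describes a novel predictor that is not reproduced here), and weights are set proportional to inverse predicted error. All function and variable names are hypothetical.

```python
from itertools import combinations


def build_meta_forecasts(base_forecasts):
    """Layer 1 (assumed): one meta model per subset of two or more base
    models, formed by averaging the subset's forecasts step by step."""
    names = sorted(base_forecasts)
    meta = {}
    for size in range(2, len(names) + 1):
        for subset in combinations(names, size):
            series = [base_forecasts[name] for name in subset]
            meta["+".join(subset)] = [sum(vals) / len(vals) for vals in zip(*series)]
    return meta


def predict_errors(meta_forecasts, actuals, t, window=3):
    """Layer 2a (assumed stand-in): estimate each meta model's error at
    step t as its mean absolute error over the `window` steps before t."""
    if t < 1:
        raise ValueError("need at least one observed step before t")
    start = max(0, t - window)
    errors = {}
    for name, series in meta_forecasts.items():
        recent = [abs(series[i] - actuals[i]) for i in range(start, t)]
        errors[name] = sum(recent) / len(recent)
    return errors


def dynamic_weights(predicted_errors, eps=1e-9):
    """Layer 2b (assumed): weights proportional to inverse predicted error,
    normalised to sum to one."""
    inverse = {name: 1.0 / (err + eps) for name, err in predicted_errors.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def dwe_forecast(meta_forecasts, actuals, t, window=3):
    """Combine the meta forecasts at step t using weights recomputed for t."""
    weights = dynamic_weights(predict_errors(meta_forecasts, actuals, t, window))
    return sum(weights[name] * meta_forecasts[name][t] for name in meta_forecasts)
```

Because the weights are recomputed at every step `t` from the latest error estimates, the combination changes over time. A fixed-weight ensemble would instead call `dynamic_weights` once and reuse the result.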


Data availability

Available upon request.

Code availability

Available upon request.

Acknowledgements

This research is funded by Vietnam National University Ho Chi Minh City (VNU-HCM) under Grant Number DS2022-26-03.

Author information

Contributions

TDD, TDN, VCT, and STM developed the core algorithms and performed experiments. THTT, DP and DTA collected and preprocessed the data and performed experiments on some traditional models. TDN and STM supervised the project. All authors participated in writing the paper and in project discussions.

Corresponding author

Correspondence to Thuan Dinh Nguyen.

Ethics declarations

Conflicts of interest

Not applicable.

Ethics approval

Not applicable.

Consent to participate

TDD, TDN, VCT, STM, THTT, DP and DTA agree to participate.

Consent for publication

TDD, TDN, VCT, STM, THTT, DP and DTA agree to the publication of their individual data and images.

Additional information

Editors: Dino Ienco, Robert Interdonato, Pascal Poncelet.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix


Figure 17 shows the monthly average DH rate and climate factors (January to December) for all provinces. Across the whole country, DH rates peak between March and September, when both rainfall and temperature are higher.

Fig. 17 Geographical representation of monthly average diarrhoea rate and climate factors from Jan (1) to Dec (12)
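The monthly averaging behind a figure of this kind can be sketched as follows. This is an illustrative Python sketch only: the `(month, value)` record layout and the function names are assumptions, and the numbers in the usage example are made up rather than taken from the paper's data.

```python
from collections import defaultdict


def monthly_means(records):
    """Average values per calendar month.

    `records` is an iterable of (month, value) pairs with month in 1..12
    (a hypothetical layout); returns {month: mean value}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for month, value in records:
        sums[month] += value
        counts[month] += 1
    return {month: sums[month] / counts[month] for month in sorted(sums)}


def peak_months(means, top_k=3):
    """Return the `top_k` months with the highest mean, in calendar order."""
    ranked = sorted(means, key=means.get, reverse=True)
    return sorted(ranked[:top_k])


# Made-up example values, for illustration only.
example = [(1, 2.0), (1, 4.0), (6, 10.0), (7, 8.0), (12, 1.0)]
example_means = monthly_means(example)
example_peaks = peak_months(example_means, top_k=2)
```

Applying the same averaging to rainfall and temperature series would allow the peak months of each variable to be compared side by side.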

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Do, T.D., Nguyen, T.D., Ta, V.C. et al. Dynamic weighted ensemble for diarrhoea incidence predictions. Mach Learn 113, 2129–2152 (2024). https://doi.org/10.1007/s10994-023-06465-z

  • DOI: https://doi.org/10.1007/s10994-023-06465-z
