Evaluating Imputation Methods for Missing Data in a MCI Dataset

  • Conference paper
Artificial Intelligence in Neuroscience: Affective Analysis and Health Applications (IWINAC 2022)

Abstract

Missing data is a recurrent problem in experimental studies, particularly in clinical and sociodemographic longitudinal studies, because of dropout and the refusal of some subjects to answer or perform certain tests. Incorrect treatment of missing values can bias the database in one or more parameters, compromising the viability of the database and of future studies. Many imputation techniques have been developed over recent decades to address this problem, but there are no regulations or clear guidelines for choosing among them. In this study, we analyze and impute a real, incomplete database for the early detection of mild cognitive impairment (MCI), in which the loss of values in 3 main variables is strongly correlated with years of education. The imputation follows two strategies: one assumes that those subjects would have scored poorly had they taken the test and assigns them a defined ceiling score; the other is multiple imputation by fully conditional specification. To determine whether the imputation introduced any bias in mean or variance, the original database was compared with the imputed databases. Using a p-value threshold of 0.1, the database imputed by multiple imputation best preserved the information of the original database, making it the more appropriate imputation method for this MCI database.
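The contrast between the two strategies can be illustrated with a minimal sketch on synthetic data. Everything in it is an assumption made for illustration: the variable names, the data-generating model, the use of the lowest observed score as the fixed worst-case value, and above all the second imputer, which is a single-variable stochastic regression repeated m times. That is a simplified stand-in for multiple imputation by fully conditional specification, not the procedure used in the paper. The abstract does not name the statistical tests behind its p-values, so the sketch only reports means and variances.

```python
import random
import statistics

def fixed_value_impute(scores, fill):
    """Single imputation: every missing score gets the same fixed value."""
    return [fill if s is None else s for s in scores]

def regression_draw_impute(edu, scores, rng):
    """Fill each missing score with a regression prediction plus random noise."""
    pairs = [(e, s) for e, s in zip(edu, scores) if s is not None]
    mean_e = statistics.fmean([e for e, _ in pairs])
    mean_s = statistics.fmean([s for _, s in pairs])
    slope = (sum((e - mean_e) * (s - mean_s) for e, s in pairs)
             / sum((e - mean_e) ** 2 for e, _ in pairs))
    intercept = mean_s - slope * mean_e
    residuals = [s - (intercept + slope * e) for e, s in pairs]
    resid_sd = statistics.stdev(residuals)
    return [intercept + slope * e + rng.gauss(0, resid_sd) if s is None else s
            for e, s in zip(edu, scores)]

def summary(values):
    """Return (mean, sample variance) of a complete list of scores."""
    return statistics.fmean(values), statistics.variance(values)

# Hypothetical synthetic cohort: the score rises with years of education.
rng = random.Random(0)
n = 300
edu = [i % 21 for i in range(n)]  # years of education, 0..20
true_scores = [10 + 1.5 * e + rng.gauss(0, 3) for e in edu]

# Missingness tied to education: half of the subjects with <= 5 years lack a score.
scores = [None if (e <= 5 and i % 2 == 0) else s
          for i, (e, s) in enumerate(zip(edu, true_scores))]
observed = [s for s in scores if s is not None]

# Strategy 1: fixed worst-case value (assumed here to be the lowest observed score).
fixed = fixed_value_impute(scores, fill=min(observed))

# Strategy 2 (stand-in): m stochastic regression imputations, summaries averaged.
m = 5
multiple = [regression_draw_impute(edu, scores, rng) for _ in range(m)]
mi_mean = statistics.fmean([summary(d)[0] for d in multiple])
mi_var = statistics.fmean([summary(d)[1] for d in multiple])

print("observed       mean=%.2f var=%.2f" % summary(observed))
print("fixed value    mean=%.2f var=%.2f" % summary(fixed))
print("regression MI  mean=%.2f var=%.2f" % (mi_mean, mi_var))
```

A full fully conditional specification would cycle over all incomplete variables, modelling each on the others, and would also draw the regression parameters rather than fixing them at their estimates; libraries that implement chained equations handle those steps.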

Author information

Corresponding author

Correspondence to Alba Gómez-Valadés Batanero.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Batanero, A.GV., Zamorano, M.R., Tomás, R.M., Martín, J.G. (2022). Evaluating Imputation Methods for Missing Data in a MCI Dataset. In: Ferrández Vicente, J.M., Álvarez-Sánchez, J.R., de la Paz López, F., Adeli, H. (eds) Artificial Intelligence in Neuroscience: Affective Analysis and Health Applications. IWINAC 2022. Lecture Notes in Computer Science, vol 13258. Springer, Cham. https://doi.org/10.1007/978-3-031-06242-1_44

  • DOI: https://doi.org/10.1007/978-3-031-06242-1_44

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-06241-4

  • Online ISBN: 978-3-031-06242-1

  • eBook Packages: Computer Science (R0)
