Investigating the Performance of a Variation of Multiple Correspondence Analysis for Multiple Imputation in Categorical Data Sets

Journal of Classification

Abstract

Non-response in survey data, especially in multivariate categorical variables, is a common problem which often leads to invalid inferences and inefficient estimates. A regularized iterative multiple correspondence analysis (RIMCA) algorithm in single imputation (SI) has been suggested for the handling of missing categorical data in survey analysis. This paper proposes an adapted version of the SI algorithm for multiple imputation (MI). The SI and MI techniques are compared for both simulated and real questionnaire data. A comparison between RIMCA MI and Sequential Regression Multiple Imputation (SRMI) is shown to establish the success of the proposed MI procedure.
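The abstract does not say how results from the multiply imputed data sets are combined. As a minimal sketch, assuming the standard combining rules for multiple imputation (Rubin's rules) are applied to a scalar estimate such as a category proportion, the pooling step could look like the following. The function name `pool_rubin` and the example numbers are hypothetical and are not taken from the paper.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m completed-data results with Rubin's combining rules.

    estimates: point estimates, one per imputed data set
    variances: the corresponding within-imputation variances
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations and matching lengths")

    q_bar = mean(estimates)          # pooled point estimate
    within = mean(variances)         # average within-imputation variance
    between = variance(estimates)    # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Illustrative values only: a category proportion estimated on m = 3 imputed data sets.
example_estimates = [0.40, 0.44, 0.42]
example_variances = [0.010, 0.012, 0.011]
pooled_estimate, total_variance = pool_rubin(example_estimates, example_variances)
```

Single imputation yields one completed data set and therefore has no between-imputation term, which is why its variance estimates tend to understate the uncertainty caused by the missing values.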

Author information

Corresponding author

Correspondence to Johané Nienkemper-Swanepoel.

About this article

Cite this article

Nienkemper-Swanepoel, J., von Maltitz, M.J. Investigating the Performance of a Variation of Multiple Correspondence Analysis for Multiple Imputation in Categorical Data Sets. J Classif 34, 384–398 (2017). https://doi.org/10.1007/s00357-017-9238-6

  • DOI: https://doi.org/10.1007/s00357-017-9238-6
