Abstract
Non-response in survey data, especially in multivariate categorical variables, is a common problem that often leads to invalid inferences and inefficient estimates. A regularized iterative multiple correspondence analysis (RIMCA) algorithm for single imputation (SI) has been suggested for handling missing categorical data in survey analysis. This paper proposes an adapted version of the SI algorithm for multiple imputation (MI). The SI and MI techniques are compared on both simulated and real questionnaire data. RIMCA MI is also compared with Sequential Regression Multiple Imputation (SRMI) to establish the success of the proposed MI procedure.
Cite this article
Nienkemper-Swanepoel, J., von Maltitz, M.J. Investigating the Performance of a Variation of Multiple Correspondence Analysis for Multiple Imputation in Categorical Data Sets. J Classif 34, 384–398 (2017). https://doi.org/10.1007/s00357-017-9238-6
DOI: https://doi.org/10.1007/s00357-017-9238-6