A Further Proposal to Perform Multiple Imputation on a Bunch of Polytomous Items Based on Latent Class Analysis

Sulis, Isabella

doi:10.1007/978-3-319-00032-9_41

Part of the book series: Studies in Classification, Data Analysis, and Knowledge Organization (STUDIES CLASS)

Abstract

This work advances an imputation procedure for categorical scales which relies on the results of Latent Class Analysis and Multiple Imputation Analysis. The procedure allows us to use the information stored in the joint multivariate structure of the data set and to take into account the uncertainty related to the true unobserved values. The results are validated in the Item Response Models framework by assessing how accurately key parameters are estimated in a data set in which missing observations are simulated under a Missing at Random mechanism. The sensitivity of the multiple imputation methods is assessed with respect to two factors: the number of latent classes set up in the Latent Class Model and the rate of missing observations in each variable. The relative accuracy in estimation is assessed with respect to the Multiple Imputation by Chained Equations method for handling missing data in categorical variables.
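
In a multiple imputation analysis, the uncertainty about the unobserved values is usually carried into the final results by combining the estimates from the M completed data sets with Rubin's (1987) rules. The following is a minimal sketch of that pooling step, not code from the chapter; the function name pool_rubin and the numeric inputs are hypothetical.

```r
# Pool M point estimates and their variances with Rubin's (1987) rules.
# `est` and `var_within` are hypothetical inputs: one entry per imputed data set.
pool_rubin <- function(est, var_within) {
  M <- length(est)
  stopifnot(M >= 2, length(var_within) == M)
  q_bar <- mean(est)          # pooled point estimate
  w_bar <- mean(var_within)   # average within-imputation variance
  b <- var(est)               # between-imputation variance
  total <- w_bar + (1 + 1 / M) * b
  c(estimate = q_bar, within = w_bar, between = b, total = total)
}

# Made-up estimates of one parameter from M = 3 imputed data sets.
pooled <- pool_rubin(est = c(1.0, 1.2, 0.8), var_within = c(0.04, 0.05, 0.03))
print(pooled)
```

The between-imputation term is what reflects the uncertainty due to missingness: it vanishes only if all imputed data sets yield the same estimate.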

References

Linzer, D. A., & Lewis, J. B. (2011). poLCA: An R package for polytomous variable latent class analysis. Journal of Statistical Software, 42(10), 1–29. http://www.jstatsoft.org/v42/i10/.
Little, R. J. A., & Rubin, D. B. (2002). Statistical analysis with missing data (2nd ed.). New York: Wiley.
Rizopoulos, D. (2006). ltm: Latent trait models under IRT. R package version 0.5-0.
Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. New York: Wiley.
Schafer, J. (1997). Analysis of incomplete multivariate data. Boca Raton, FL: Chapman and Hall.
Sulis, I., & Porcu, M. (2008). Assessing the effectiveness of a stochastic regression imputation method for ordered categorical data. Working paper. Quaderni di Ricerca CRENoS, 4. http://crenos.unica.it/crenos/it/node/269.
van Buuren, S., & Groothuis-Oudshoorn, K. (2011). mice: Multivariate imputation by chained equations in R. Journal of Statistical Software, 45(3), 1–67. http://www.jstatsoft.org/v45/i03/.
Vermunt, J. K., Van Ginkel, J. R., Van der Ark, L. A., & Sijtsma, K. (2008). Multiple imputation of categorical data using latent class analysis. Sociological Methodology, 33, 269–297.

Author information

Authors and Affiliations

Dipartimento di Scienze Sociali e delle Istituzioni, Università di Cagliari, Viale S. Ignazio 78, Cagliari, Italy
Isabella Sulis

Corresponding author

Correspondence to Isabella Sulis.

Editor information

Editors and Affiliations

Department of Economics and Management, University of Pavia, Via San Felice 7, Pavia, 27100, Italy
Paolo Giudici
Department of Economics and Business, University of Catania, Corso Italia 55, Catania, 95129, Italy
Salvatore Ingrassia
Department of Statistics, University of Rome "La Sapienza", Piazzale Aldo Moro 5, Rome, 00185, Italy
Maurizio Vichi

Appendix: miLCApol Function Written in the R Language

Description: Function to implement the MILCA procedure

Use: miLCApol(m, K, cl, rep, fs, item)

Arguments:

item: A data frame containing the J categorical variables (the same ones specified in the fs formula), all measured on a categorical scale with K − 1 categories. The variables in item must be coded with consecutive values from 1 to K − 1, and all missing values must be coded as NA (see the poLCA manual, Linzer and Lewis (2011), for details)
fs: A formula expression which uses as responses the items contained in the data frame item, e.g. fs <- cbind(Y1, ..., YJ) ~ 1 (see the poLCA manual, Linzer and Lewis (2011), for details)
m: The number M of randomly imputed data sets
K: The number of categories of the items plus 1
cl: The number of latent classes (see the poLCA manual, Linzer and Lewis (2011), for details)
rep: The number of times the poLCA estimation has to be repeated in order to avoid local maxima (see the poLCA manual)
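
As an illustration of the input format described by these arguments, the following sketch builds a small data frame and the matching formula. The data, the item names Y1 to Y3 and the choice of four response categories are hypothetical and serve only to show the coding conventions.

```r
# Hypothetical example data: J = 3 items with 4 response categories,
# so K = 5 in the notation of the argument list (categories plus 1).
set.seed(1)
n <- 20
item <- data.frame(
  Y1 = sample(1:4, n, replace = TRUE),
  Y2 = sample(1:4, n, replace = TRUE),
  Y3 = sample(1:4, n, replace = TRUE)
)

# Blank out a few responses; missing values must be coded as NA.
item$Y2[c(3, 7)] <- NA
item$Y3[5] <- NA

K <- 4 + 1                    # number of categories plus 1
fs <- cbind(Y1, Y2, Y3) ~ 1   # all items as responses, no covariates

# Check the coding: only the values 1..K-1 or NA are allowed.
stopifnot(all(unlist(item) %in% c(1:(K - 1), NA)))
```

With these objects, the remaining arguments (m, cl and rep) are plain integers chosen by the analyst.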

Function

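miLCApol <- function(m, K, cl, rep, fs, item) {
  library(poLCA)

  # Recode missing values (NA) as an extra category K, so that poLCA
  # treats nonresponse as a response category of its own.
  replacemiss <- function(item) {
    itemp <- matrix(NA, nrow(item), ncol(item))
    for (i in 1:ncol(item)) {
      itemp[, i] <- ifelse(is.na(item[, i]), K, item[, i])
    }
    return(itemp)
  }
  itempr <- as.data.frame(replacemiss(item))
  dimnames(itempr) <- dimnames(item)

  ## see poLCA manual to specify further options in poLCA
  msim <- poLCA(fs, data = itempr, nclass = cl, nrep = rep, na.rm = FALSE)
  pr <- msim$probs          # class-conditional response probabilities
  classm <- msim$predclass  # modal class assignment of each unit

  n <- nrow(itempr)
  R <- nrow(pr[[1]])  # number of fitted classes, including empty ones
  J <- ncol(itempr)

  # p[j, k, r]: probability of category k on item j in class r.
  # Items without missing values have only K - 1 columns in pr[[j]],
  # so p starts at 0 and only the available columns are filled.
  p <- array(0, c(J, K, R))
  for (r in 1:R) {
    for (j in 1:J) {
      p[j, seq_len(ncol(pr[[j]])), r] <- pr[[j]][r, ]
    }
  }

  # impm[, , t] is the t-th completed data set.
  impm <- array(NA, c(n, J, m))
  for (t in 1:m) {
    for (i in 1:n) {
      r <- classm[i]
      for (j in 1:J) {
        impm[i, j, t] <- if (itempr[i, j] == K) {
          # Draw a category from the class-conditional distribution and
          # redraw whenever the "missing" category K comes up.
          repeat {
            label <- which(rmultinom(1, 1, p[j, , r]) == 1)
            if (label <= K - 1) break
          }
          label
        } else {
          itempr[i, j]
        }
      }
    }
  }
  return(impm)
}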
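
The redraw step in miLCApol discards every draw that lands on category K, which amounts to sampling from the class-conditional probabilities of categories 1 to K − 1 after rescaling them to sum to one. The sketch below illustrates that equivalence in base R; the helper draw_category and the probability vector are hypothetical and are not part of the original function.

```r
# Draw one observed category (1..K-1) for a missing response, given the
# class-conditional probabilities `prob` over the K categories, where
# category K stands for "missing". `draw_category` is a hypothetical helper.
draw_category <- function(prob, K) {
  stopifnot(length(prob) == K, sum(prob[-K]) > 0)
  # sample.int() treats `prob` as weights that need not sum to one, so
  # dropping category K conditions on the response being observed.
  sample.int(K - 1, size = 1, prob = prob[-K])
}

set.seed(123)
prob <- c(0.1, 0.4, 0.2, 0.3)   # hypothetical probabilities, K = 4
draws <- replicate(1000, draw_category(prob, K = 4))
print(table(draws))
```

The explicit check on sum(prob[-K]) guards against a class in which every response to an item is missing, a case in which a redraw loop would never terminate.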

About this paper

Cite this paper

Sulis, I. (2013). A Further Proposal to Perform Multiple Imputation on a Bunch of Polytomous Items Based on Latent Class Analysis. In: Giudici, P., Ingrassia, S., Vichi, M. (eds) Statistical Models for Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Heidelberg. https://doi.org/10.1007/978-3-319-00032-9_41

DOI: https://doi.org/10.1007/978-3-319-00032-9_41
Published: 22 May 2013
Publisher Name: Springer, Heidelberg
Print ISBN: 978-3-319-00031-2
Online ISBN: 978-3-319-00032-9
eBook Packages: Mathematics and Statistics (R0)
