Abstract
Ranking data appear in everyday life and arise in many fields of study, such as marketing, psychology and politics. Very often, the key objective of analyzing and modeling ranking data is to identify the underlying factors that affect individuals’ choice behavior, and factor analysis for ranking data is one of the most widely used methods for this purpose. Recently, Yu et al. [J R Stat Soc Ser A (Statistics in Society) 168:583–597, 2005] developed factor models for ranked data in which each individual is asked to rank a set of items. However, paired ranked data arise when the same set of items is ranked by a pair of judges, such as a couple in a family. This paper extends the factor model to accommodate such paired ranked data. A Monte Carlo expectation-maximization algorithm is used for parameter estimation, in which the E-step is implemented via the Gibbs sampler. For model assessment and selection, a tailor-made method called the bootstrap predictive checks approach is proposed. Simulation studies are conducted to illustrate the proposed estimation and model selection methods. The proposed method is applied to parent–child partially ranked data collected from a value priorities survey carried out in the United States.


Notes
The derivation applies the properties of the bivariate normal distribution. See Section 4.3.8 “Moments and Absolute Moments” in Hutchinson and Lai (1990).
In our study, the value of \(h\) is monitored, and so far no violation of the condition \(h > 0\) has been detected.
If \({\varvec{A}}\) and \({\varvec{D}}\) are symmetric and the inverses below exist, then
$$\begin{aligned} \begin{pmatrix} {\varvec{A}}&\quad {\varvec{B}} \\ {\varvec{B}}^T&\quad {\varvec{D}} \\ \end{pmatrix}^{-1} = \begin{pmatrix} {\varvec{A}}^{-1}+{\varvec{F}}{\varvec{E}}^{-1}{\varvec{F}}^T&\quad -{\varvec{F}}{\varvec{E}}^{-1} \\ -{\varvec{E}}^{-1}{\varvec{F}}^T&\quad {\varvec{E}}^{-1} \\ \end{pmatrix}, \end{aligned}$$where \({\varvec{E}}={\varvec{D}}-{\varvec{B}}^T{\varvec{A}}^{-1}{\varvec{B}}\) and \({\varvec{F}}={\varvec{A}}^{-1}{\varvec{B}}\).
References
Barnes SH, Kaase M, Allerbeck KR, Farah BG, Heunks F, Inglehart R, Jennings MK, Klingemann HD, Marsh A, Rosenmayr L (1979) Political action: mass participation in five western democracies. Sage, Beverly Hills, CA
Barnes SH, Samuel H, Kaase M (1999) Political action: an eight nation study, 1973–1976 (Computer file). ICPSR version. Conducted by University of Michigan, Survey Research Center. ICPSR
Blackwell D (1947) Conditional expectation and unbiased sequential estimation. Ann Math Stat 18:105–110
Bock RD, Böckenholt U (2005) Nominal categories model. In: Kempf-Leonard K (ed) Encyclopedia of social measurement. Elsevier, Amsterdam
Bock RD, Gibbons RD (1996) High-dimensional multivariate probit analysis. Biometrics 52:1183–1194
Böckenholt U (1996) Analysing multiattribute ranking data: joint and conditional approaches. Br J Math Stat Psychol 49:57–78
Booth JG, Hobert JP (1999) Maximizing generalized linear mixed model likelihoods with an automated Monte Carlo EM algorithm. J R Stat Soc Ser B (Methodological) 61:265–285
Chan JSK, Kuk AYC (1997) Maximum likelihood estimation for probit-linear mixed models with correlated random effects. Biometrics 53:86–97
Cudeck R (1988) Multiplicative models and MTMM matrices. J Educ Stat 13:131–147
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B (Methodological) 39:1–38
Dunham W (1990) Journey through genius: the great theorems of mathematics. Wiley, New York
Gelfand AE, Smith AFM (1990) Sampling-based approaches to calculating marginal densities. J Am Stat Assoc 85:398–409
Gelman A, Meng XL, Stern HS (1996) Posterior predictive assessment of model fitness via realized discrepancies. Statistica Sinica 6:733–807
Geman S, Geman D (1984) Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans Pattern Anal Mach Intell 6:721–741
Geweke J (1991) Efficient simulation from the multivariate normal and student-t distributions subject to linear constraints. In: Computer science and statistics: proceedings of the twenty-third symposium on the interface, pp 571–578
Hajivassiliou V, McFadden D (1990) The method of simulated scores for the estimation of LDV models with an application to external debt crises. Cowles Foundation Yale University discussion paper 967
Hutchinson TP, Lai CD (1990) Continuous bivariate distributions, emphasising applications. Rumsby Scientific, Adelaide
Inglehart R (1977) The silent revolution: changing values and political styles among western publics. Princeton University Press, Princeton
Keane MP (1994) A computationally practical simulation estimator for panel data. Econometrica 62:95–116
Louis TA (1982) Finding the observed information matrix when using the EM algorithm. J R Stat Soc Ser B (Methodological) 44:226–233
Maydeu-Olivares A, Böckenholt U (2005) Structural equation modeling of paired comparison and ranking data. Psychol Methods 10:285–304
McLachlan GJ, Krishnan T (1997) The EM algorithm and extensions. Wiley, New York
Meng XL, Schilling S (1996) Fitting full-information item factor models and an empirical investigation of bridge sampling. J Am Stat Assoc 91:1254–1267
Meng XL, Wong WH (1996) Simulating ratios of normalizing constants via a simple identity: a theoretical exploration. Statistica Sinica 6:831–860
Ogasawara H (2009) Asymptotic expansions in the singular value decomposition for cross covariance and correlation under nonnormality. Ann Inst Stat Math 61:995–1017
Rao CR (1965) Linear statistical inference and its applications. Wiley, London
Rubin DB (1984) Bayesianly justifiable and relevant frequency calculations for the applied statistician. Ann Stat 12:1151–1172
Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6:461–464
Thurstone LL (1947) Multiple factor analysis. University of Chicago Press, Chicago
Tsai RC, Yao G (2000) Testing Thurstonian case V ranking models using posterior predictive checks. Br J Math Stat Psychol 53:275–292
van Dyk D (2000) Nesting EM algorithms for computational efficiency. Statistica Sinica 10:203–225
Wegelin JA, Packer A, Richardson TS (2006) Latent models for cross-covariance. J Multivar Anal 97:79–102
Wei GCG, Tanner MA (1990) A Monte Carlo implementation of the EM algorithm and the poor man’s data augmentation algorithms. J Am Stat Assoc 85:699–704
Yao KG, Böckenholt U (1999) Bayesian estimation of Thurstonian ranking models based on the Gibbs sampler. Br J Math Stat Psychol 52:79–92
Yu PLH (2000) Bayesian analysis of order-statistics models for ranking data. Psychometrika 65:281–299
Yu PLH, Lam KF, Lo SM (2005) Factor analysis for ranked data with application to a job selection attitude survey. J R Stat Soc Ser A (Statistics in Society) 168:583–597
Additional information
The research of Philip L. H. Yu was supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project No. HKU 7473/05H). We thank the two anonymous referees for their helpful suggestions for improving this article.
Appendix
The complete-data log-likelihood function is:
where
assuming that \({\varvec{\rho }}_{12}\) is a \(d \times d\) diagonal matrix whose \((\ell ,\ell )\)th element equals \({\rho }_{12,\ell }\). Then \(\log |{\varvec{\varSigma }}_{f}| = \log \prod _{\ell =1}^d (1-\rho ^2_{12,\ell }) = \sum _{\ell =1}^d \log (1-\rho ^2_{12,\ell })\).
Also, by the partitioned-matrix inverse in Note 3,
therefore,
Ignoring the terms independent of \({\rho }_{12,\ell }\), this becomes \(\left( 1+\frac{{\rho }_{12,\ell }^2}{1-{\rho }_{12,\ell }^2}\right) \sum _{i=1}^n f_{i1\ell }^2 - \frac{2{\rho }_{12,\ell }}{1-{\rho }_{12,\ell }^2}\sum _{i=1}^n f_{i1\ell }f_{i2\ell } + \frac{1}{1-{\rho }_{12,\ell }^2}\sum _{i=1}^n f_{i2\ell }^2\).
Hence, the derivative of the complete-data log-likelihood with respect to \({\rho }_{12,\ell }\) equals
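As a sketch of this differentiation, assume that the part of the complete-data log-likelihood depending on \({\rho }_{12,\ell }\) has the usual Gaussian form, \(-\frac{n}{2}\log (1-\rho ^2_{12,\ell })\) minus one half of the quadratic expression given above. The shorthand \(S_{11}\), \(S_{22}\), \(S_{12}\) and \(Q\) is introduced here for illustration only and is not the authors' notation; \(\rho\) abbreviates \({\rho }_{12,\ell }\).

```latex
\begin{aligned}
S_{11} &= \sum_{i=1}^n f_{i1\ell}^2, \qquad
S_{22} = \sum_{i=1}^n f_{i2\ell}^2, \qquad
S_{12} = \sum_{i=1}^n f_{i1\ell} f_{i2\ell}, \\
Q(\rho) &= -\frac{n}{2}\log(1-\rho^2)
          - \frac{S_{11} - 2\rho S_{12} + S_{22}}{2(1-\rho^2)}, \\
\frac{\partial Q}{\partial \rho}
  &= \frac{n\rho}{1-\rho^2}
   + \frac{S_{12}(1+\rho^2) - \rho\,(S_{11}+S_{22})}{(1-\rho^2)^2} \\
  &= \frac{-n\rho^3 + S_{12}\rho^2 + (n - S_{11} - S_{22})\rho + S_{12}}{(1-\rho^2)^2}.
\end{aligned}
```

Under this assumption, setting the derivative to zero amounts to finding a root of the cubic numerator. The numerator equals \(\sum _{i}(f_{i1\ell }+f_{i2\ell })^2 \ge 0\) at \(\rho =-1\) and \(-\sum _{i}(f_{i1\ell }-f_{i2\ell })^2 \le 0\) at \(\rho =1\), so at least one real root lies in \([-1,1]\).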
Note that this is a cubic function of \({\rho }_{12,\ell }\) and is decreasing in \({\rho }_{12,\ell }\). Therefore, if there is only one real solution for \({\rho }_{12,\ell }\), that solution maximizes the complete-data log-likelihood function.
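The root-finding step can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes the stationarity condition reduces to the cubic \(-n\rho ^3 + S_{12}\rho ^2 + (n-S_{11}-S_{22})\rho + S_{12} = 0\), where \(S_{11}=\sum _i f_{i1\ell }^2\), \(S_{22}=\sum _i f_{i2\ell }^2\) and \(S_{12}=\sum _i f_{i1\ell }f_{i2\ell }\), which follows if the \(\rho\)-dependent part of the log-likelihood has the standard Gaussian form. The function name `rho_update` and its arguments are hypothetical.

```python
import numpy as np


def rho_update(f1, f2):
    """Illustrative update of one correlation rho_{12,l} (hypothetical helper).

    f1, f2: length-n arrays holding the l-th factor scores of the first and
    second member of each pair. Assumes both scores have unit variance and
    correlation rho, as in the Gaussian form described in the lead-in.
    """
    f1 = np.asarray(f1, dtype=float)
    f2 = np.asarray(f2, dtype=float)
    n = f1.size
    s11 = np.sum(f1 ** 2)
    s22 = np.sum(f2 ** 2)
    s12 = np.sum(f1 * f2)

    # Cubic numerator of the derivative, highest power first.
    coeffs = [-n, s12, n - s11 - s22, s12]
    roots = np.roots(coeffs)

    # Keep real roots strictly inside (-1, 1), where the density is proper.
    real = roots[np.abs(roots.imag) < 1e-8].real
    candidates = real[np.abs(real) < 1.0]
    if candidates.size == 0:
        raise ValueError("no admissible root in (-1, 1)")

    def objective(r):
        # rho-dependent part of the assumed complete-data log-likelihood.
        return (-0.5 * n * np.log1p(-r ** 2)
                - (s11 - 2.0 * r * s12 + s22) / (2.0 * (1.0 - r ** 2)))

    # If several admissible roots exist, return the one with the largest value.
    return float(max(candidates, key=objective))
```

When only one admissible root exists, the final comparison is a no-op and the function simply returns that root. In a Monte Carlo EM setting the three sums would presumably be replaced by averages over the sampled factor scores; that substitution is an assumption and is not shown here.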
Cite this article
Yu, P.L.H., Lee, P.H. & Wan, W.M. Factor analysis for paired ranked data with application on parent–child value orientation preference data. Comput Stat 28, 1915–1945 (2013). https://doi.org/10.1007/s00180-012-0387-0