Factor analysis for paired ranked data with application on parent–child value orientation preference data

Yu, Philip L. H.; Lee, Paul H.; Wan, W. M.

doi:10.1007/s00180-012-0387-0

Original Paper
Published: 15 December 2012

Volume 28, pages 1915–1945, (2013)
Computational Statistics

Abstract

Ranking data appear in everyday life and arise in many fields of study such as marketing, psychology and politics. Very often, the key objective of analyzing and modeling ranking data is to identify underlying factors that affect the individuals’ choice behavior. Factor analysis for ranking data is one of the most widely used methods to tackle this problem. Recently, Yu et al. [J R Stat Soc Ser A (Statistics in Society) 168:583–597, 2005] developed factor models for ranked data in which each individual is asked to rank a set of items. However, paired ranked data may arise when the same set of items is ranked by a pair of judges, such as a couple in a family. This paper extends the factor model to accommodate such paired ranked data. The Monte Carlo expectation-maximization algorithm is used for parameter estimation, with the E-step implemented via the Gibbs sampler. For model assessment and selection, a tailor-made method called the bootstrap predictive checks approach is proposed. Simulation studies are conducted to illustrate the proposed estimation and model selection methods. The proposed method is applied to parent–child partially ranked data collected from a value priorities survey carried out in the United States.

Notes

The derivation applies the properties of the bivariate normal distribution. See Section 4.3.8 “Moments and Absolute Moments” in Hutchinson and Lai (1990).
In our study, the value of $h$ was monitored, and no violation of the condition $h > 0$ was detected.
If ${\varvec{A}}$ and ${\varvec{D}}$ are symmetric, then
$$\begin{aligned} \begin{pmatrix} {\varvec{A}}&\quad {\varvec{B}} \\ {\varvec{B}}^T&\quad {\varvec{D}} \\ \end{pmatrix}^{-1} = \begin{pmatrix} {\varvec{A}}^{-1}+{\varvec{F}}{\varvec{E}}^{-1}{\varvec{F}}^T&\quad -{\varvec{F}}{\varvec{E}}^{-1} \\ -{\varvec{E}}^{-1}{\varvec{F}}^T&\quad {\varvec{E}}^{-1} \\ \end{pmatrix}, \end{aligned}$$
where ${\varvec{E}}={\varvec{D}}-{\varvec{B}}^T{\varvec{A}}^{-1}{\varvec{B}}$ and ${\varvec{F}}={\varvec{A}}^{-1}{\varvec{B}}$.
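
The partitioned-inverse identity in the last note, with ${\varvec{E}}^{-1}$ (the inverse of the Schur complement) appearing in every block of the result, can be checked numerically. The following is only a sketch: the function name `block_inverse` and the random test matrix are assumptions made for illustration and are not taken from the article.

```python
import numpy as np


def block_inverse(A, B, D):
    """Invert [[A, B], [B.T, D]] using E = D - B.T A^{-1} B and F = A^{-1} B."""
    A_inv = np.linalg.inv(A)
    F = A_inv @ B
    E_inv = np.linalg.inv(D - B.T @ F)
    top = np.hstack([A_inv + F @ E_inv @ F.T, -F @ E_inv])
    bottom = np.hstack([-E_inv @ F.T, E_inv])
    return np.vstack([top, bottom])


# Build a symmetric positive definite 6 x 6 matrix and split it into 3 x 3 blocks.
rng = np.random.default_rng(0)
X = rng.standard_normal((6, 6))
M = X @ X.T + 6.0 * np.eye(6)
A, B, D = M[:3, :3], M[:3, 3:], M[3:, 3:]

# Largest absolute difference from the direct inverse (should be near machine precision).
max_abs_diff = np.max(np.abs(block_inverse(A, B, D) - np.linalg.inv(M)))
print(max_abs_diff)
```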

References

Barnes SH, Kaase M, Allerbeck KR, Farah BG, Heunks F, Inglehart R, Jennings MK, Klingemann HD, Marsh A, Rosenmayr L (1979) Political action: mass participation in five western democracies. Sage, Beverly Hills, CA
Barnes SH, Samuel H, Kaase M (1999) Political action: an eight nation study, 1973–1976 (Computer file). ICPSR version. Conducted by University of Michigan, Survey Research Center. ICPSR
Blackwell D (1947) Conditional expectation and unbiased sequential estimation. Ann Math Stat 18:105–110
Bock RD, Böckenholt U (2005) Nominal categories model. In: Kempf-Leonard K (ed) Encyclopedia of social measurement. Elsevier, Amsterdam
Bock RD, Gibbons RD (1996) High-dimensional multivariate probit analysis. Biometrics 52:1183–1194
Böckenholt U (1996) Analysing multiattribute ranking data: joint and conditional approaches. Br J Math Stat Psychol 49:57–78
Booth JG, Hobert JP (1999) Maximizing generalized linear mixed model likelihoods with an automated Monte Carlo EM algorithm. J R Stat Soc Ser B (Methodological) 61:265–285
Chan JSK, Kuk AYC (1997) Maximum likelihood estimation for probit-linear mixed models with correlated random effects. Biometrics 53:86–97
Cudeck R (1988) Multiplicative models and MTMM matrices. J Educ Stat 13:131–147
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood for incomplete data via the EM algorithm. J R Stat Soc Ser B (Methodological) 39:1–38
Dunham W (1990) Journey through genius: the great theorems of mathematics. Wiley, New York
Gelfand AE, Smith AFM (1990) Sampling-based approaches to calculating marginal densities. J Am Stat Assoc 85:398–409
Gelman A, Meng XL, Stern HS (1996) Posterior predictive assessment of model fitness via realized discrepancies. Statistica Sinica 6:733–807
Geman S, Geman D (1984) Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans Pattern Anal Mach Intell 6:721–741
Geweke J (1991) Efficient simulation from the multivariate normal and student-t distributions subject to linear constraints. In: Computer science and statistics: proceedings of the twenty-third symposium on the interface, pp 571–578
Hajivassiliou V, McFadden D (1990) The method of simulated scores for the estimation of LDV models with an application to external debt crises. Cowles Foundation Yale University discussion paper 967
Hutchinson TP, Lai CD (1990) Continuous bivariate distributions, emphasising applications. Rumsby Scientific, Adelaide
Inglehart R (1977) The silent revolution: changing values and political styles among western publics. Princeton University Press, Princeton
Keane MP (1994) A computationally practical simulation estimator for panel data. Econometrica 62:95–116
Louis TA (1982) Finding the observed information matrix when using the EM algorithm. J R Stat Soc Ser B (Methodological) 44:226–233
Maydeu-Olivares A, Böckenholt U (2005) Structural equation modeling of paired comparison and ranking data. Psychol Methods 10:285–304
McLachlan GJ, Krishnan T (1997) The EM algorithm and extensions. Wiley, New York
Meng XL, Schilling S (1996) Fitting full-information item factor models and an empirical investigation of bridge sampling. J Am Stat Assoc 91:1254–1267
Meng XL, Wong WH (1996) Simulating ratios of normalizing constants via a simple identity: a theoretical exploration. Statistica Sinica 6:831–860
Ogasawara H (2009) Asymptotic expansions in the singular value decomposition for cross covariance and correlation under nonnormality. Ann Inst Stat Math 61:995–1017
Rao CR (1965) Linear statistical inference and its applications. Wiley, London
Rubin DB (1984) Bayesianly justifiable and relevant frequency calculations for the applied statistician. Ann Stat 12:1151–1172
Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6:461–464
Thurstone LL (1947) Multiple factor analysis. University of Chicago Press, Chicago
Tsai RC, Yao G (2000) Testing Thurstonian case V ranking models using posterior predictive checks. Br J Math Stat Psychol 53:275–292
van Dyk D (2000) Nesting EM algorithms for computational efficiency. Statistica Sinica 10:203–225
Wegelin JA, Packer A, Richardson TS (2006) Latent models for cross-covariance. J Multivar Anal 97:79–102
Wei GCG, Tanner MA (1990) A Monte Carlo implementation of the EM algorithm and the poor man’s data augmentation algorithms. J Am Stat Assoc 85:699–704
Yao KG, Böckenholt U (1999) Bayesian estimation of Thurstonian ranking models based on the Gibbs sampler. Br J Math Stat Psychol 52:79–92
Yu PLH (2000) Bayesian analysis of order-statistics models for ranking data. Psychometrika 65:281–299
Yu PLH, Lam KF, Lo SM (2005) Factor analysis for ranked data with application to a job selection attitude survey. J R Stat Soc Ser A (Statistics in Society) 168:583–597

Author information

Authors and Affiliations

The University of Hong Kong, Room 521, Meng Wah Complex, Pokfulam, Hong Kong
Philip L. H. Yu
The University of Hong Kong, Room 524, William MW Mong Block, 21 Sassoon Road, Pokfulam, Hong Kong
Paul H. Lee
The University of Hong Kong, Room 505, Meng Wah Complex, Pokfulam, Hong Kong
W. M. Wan

Corresponding author

Correspondence to Paul H. Lee.

Additional information

The research of Philip L. H. Yu was supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project No. HKU 7473/05H). We thank the two anonymous referees for their helpful suggestions for improving this article.

Appendix

Up to an additive constant, the complete-data log-likelihood function is

$$\begin{aligned}&- \frac{n}{2} \left\{ \sum _{c=1}^{2} \log \left| {\varvec{\varPsi }} \right| + \log \left| {\varvec{\varSigma }}_{f} \right| \right\} \\&\quad - \frac{1}{2} \sum _{i=1}^{n} \left\{ \sum _{c=1}^{2} \operatorname{tr} \left[ {\varvec{\varPsi }}^{-1} ({\varvec{U}}_{ic} - {\varvec{\mu }}_{c} - {\varvec{\varLambda }}^{T} {\varvec{f}}_{ic}) ({\varvec{U}}_{ic} - {\varvec{\mu }}_{c} - {\varvec{\varLambda }}^{T} {\varvec{f}}_{ic})^{T} \right] + \operatorname{tr} \left[ {\varvec{\varSigma }}_{f}^{-1} {\varvec{f}}_{i} {\varvec{f}}_{i}^{T} \right] \right\} , \end{aligned}$$

where

$$\begin{aligned} {\varvec{\varSigma }}_{f}= \begin{pmatrix} {\varvec{I}}_d&\quad {\varvec{\rho }}_{12}\\ {\varvec{\rho }}_{12}&\quad {\varvec{I}}_d\\ \end{pmatrix}, \end{aligned}$$

assuming that ${\varvec{\rho }}_{12}$ is a $d \times d$ diagonal matrix whose ($\ell ,\ell $)th element is ${\rho }_{12,\ell }$, and ${\varvec{f}}_{i} = ({\varvec{f}}_{i1}^T, {\varvec{f}}_{i2}^T)^T$ stacks ${\varvec{f}}_{i1}$ and ${\varvec{f}}_{i2}$. Then $\log |{\varvec{\varSigma }}_{f}| = \log \prod _{\ell =1}^d (1-\rho ^2_{12,\ell }) = \sum _{\ell =1}^d \log (1-\rho ^2_{12,\ell })$.

Also, by the partitioned-matrix inverse given in the third note under Notes,

$$\begin{aligned} {\varvec{\varSigma }}_{f}^{-1} = \begin{pmatrix} {\varvec{I}}_d+{\varvec{\rho }}_{12}({\varvec{I}}_d-{\varvec{\rho }}_{12}^2)^{-1}{\varvec{\rho }}_{12}&\quad -{\varvec{\rho }}_{12}({\varvec{I}}_d-{\varvec{\rho }}_{12}^2)^{-1} \\ -({\varvec{I}}_d-{\varvec{\rho }}_{12}^2)^{-1}{\varvec{\rho }}_{12}&\quad ({\varvec{I}}_d-{\varvec{\rho }}_{12}^2)^{-1} \\ \end{pmatrix}, \end{aligned}$$

therefore,

$$\begin{aligned} \sum _{i=1}^n \operatorname{tr} \left[ {\varvec{\varSigma }}_{f}^{-1} {\varvec{f}}_{i} {\varvec{f}}_{i}^{T} \right]&= \operatorname{tr}\left[\left({\varvec{I}}_d+{\varvec{\rho }}_{12}({\varvec{I}}_d-{\varvec{\rho }}_{12}^2)^{-1}{\varvec{\rho }}_{12}\right)\sum _{i=1}^n {\varvec{f}}_{i1}{\varvec{f}}_{i1}^T \right]\\&\quad -\operatorname{tr}\left[{\varvec{\rho }}_{12}({\varvec{I}}_d-{\varvec{\rho }}_{12}^2)^{-1} \sum _{i=1}^n{\varvec{f}}_{i2}{\varvec{f}}_{i1}^T\right]\\&\quad -\operatorname{tr}\left[({\varvec{I}}_d-{\varvec{\rho }}_{12}^2)^{-1}{\varvec{\rho }}_{12} \sum _{i=1}^n{\varvec{f}}_{i1}{\varvec{f}}_{i2}^T\right]\\&\quad +\operatorname{tr}\left[({\varvec{I}}_d-{\varvec{\rho }}_{12}^2)^{-1} \sum _{i=1}^n{\varvec{f}}_{i2}{\varvec{f}}_{i2}^T\right]. \end{aligned}$$

Ignoring the terms independent of ${\rho }_{12,\ell }$, this becomes

$$\begin{aligned} \left(1+\frac{{\rho }_{12,\ell }^2}{1-{\rho }_{12,\ell }^2}\right)\sum _{i=1}^n f_{i1\ell }^2 - \frac{2{\rho }_{12,\ell }}{1-{\rho }_{12,\ell }^2}\sum _{i=1}^n f_{i1\ell }f_{i2\ell } + \frac{1}{1-{\rho }_{12,\ell }^2}\sum _{i=1}^n f_{i2\ell }^2 . \end{aligned}$$
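
The determinant, inverse and trace expressions above can be verified numerically for a diagonal ${\varvec{\rho }}_{12}$. The sketch below is illustrative only: the function names, the example correlations and the simulated factor scores are assumptions, and the closed forms rely on the simplification $1+\rho ^2/(1-\rho ^2) = 1/(1-\rho ^2)$.

```python
import numpy as np


def sigma_f(rho):
    """Build Sigma_f = [[I, diag(rho)], [diag(rho), I]] from a length-d vector."""
    rho = np.asarray(rho, dtype=float)
    d = rho.size
    R = np.diag(rho)
    return np.block([[np.eye(d), R], [R, np.eye(d)]])


def sigma_f_inverse(rho):
    """Closed-form inverse: every block is diagonal because rho_12 is diagonal."""
    rho = np.asarray(rho, dtype=float)
    w = 1.0 / (1.0 - rho**2)
    return np.block([[np.diag(w), -np.diag(rho * w)],
                     [-np.diag(rho * w), np.diag(w)]])


def log_det_sigma_f(rho):
    """log|Sigma_f| = sum over factors of log(1 - rho^2)."""
    rho = np.asarray(rho, dtype=float)
    return float(np.sum(np.log1p(-rho**2)))


def trace_term(rho, f1, f2):
    """sum_i tr[Sigma_f^{-1} f_i f_i^T], factor by factor; f1 and f2 are n x d."""
    rho = np.asarray(rho, dtype=float)
    s11 = np.sum(f1**2, axis=0)
    s22 = np.sum(f2**2, axis=0)
    s12 = np.sum(f1 * f2, axis=0)
    return float(np.sum((s11 - 2.0 * rho * s12 + s22) / (1.0 - rho**2)))


# Hypothetical example values, chosen only to exercise the formulas.
rho = np.array([0.3, -0.5, 0.8])
rng = np.random.default_rng(1)
f1 = rng.standard_normal((50, 3))
f2 = rng.standard_normal((50, 3))
f = np.hstack([f1, f2])  # row i is f_i = (f_i1, f_i2)

# Direct evaluation of sum_i f_i^T Sigma_f^{-1} f_i for comparison.
direct = float(np.einsum("ij,jk,ik->", f, np.linalg.inv(sigma_f(rho)), f))
print(direct, trace_term(rho, f1, f2))
```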

Hence, the derivative of the complete-data log-likelihood with respect to ${\rho }_{12,\ell }$ equals

$$\begin{aligned}&\frac{n{\rho }_{12,\ell }}{1-{\rho }_{12,\ell }^2} - \frac{1}{2}\left(\frac{(1-{\rho }_{12,\ell }^2)\times 2{\rho }_{12,\ell }+{\rho }_{12,\ell }^2\times 2{\rho }_{12,\ell }}{(1-{\rho }_{12,\ell }^2)^2}\sum _{i=1}^n f_{i1\ell }^2\right. \\&\qquad \left. -\frac{2\left[(1-{\rho }_{12,\ell }^2)+2{\rho }_{12,\ell }^2\right]}{(1-{\rho }_{12,\ell }^2)^2} \sum _{i=1}^n f_{i1\ell }f_{i2\ell } + \frac{2{\rho }_{12,\ell }}{(1-{\rho }_{12,\ell }^2)^2} \sum _{i=1}^n f_{i2\ell }^2 \right) \\&\quad = -\frac{1}{(1-{\rho }_{12,\ell }^2)^2}\left[n {\rho }_{12,\ell }^{3} - \left( \sum _{i=1}^{n} f_{i1\ell } f_{i2\ell } \right) {\rho }_{12,\ell }^{2} + \left( \sum _{i=1}^{n}\left[ f_{i1\ell }^{2} + f_{i2\ell }^{2} \right] - n \right) {\rho }_{12,\ell } - \sum _{i=1}^{n} f_{i1\ell } f_{i2\ell } \right]. \end{aligned}$$

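Note that the expression in square brackets is a cubic polynomial in $\rho _{12,\ell }$ with positive leading coefficient $n$, and the derivative has the opposite sign to this cubic. Therefore, if the cubic equation has only one real root, the derivative is positive to the left of that root and negative to the right of it, so the root maximizes the complete-data log-likelihood function.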
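
As an illustration of this update, the stationary points can be found by solving the cubic numerically. The sketch below is not the authors' implementation: the function names `q_rho` and `update_rho`, the tolerance, and the rule for choosing among several admissible roots (pick the one with the largest objective value) are assumptions. In a Monte Carlo EM setting the sums would presumably be replaced by Monte Carlo estimates of their conditional expectations.

```python
import numpy as np


def q_rho(rho, n, s11, s22, s12):
    """Terms of the complete-data log-likelihood that depend on one rho_{12,l}.

    s11 = sum_i f_{i1l}^2, s22 = sum_i f_{i2l}^2, s12 = sum_i f_{i1l} f_{i2l}.
    """
    return (-0.5 * n * np.log(1.0 - rho**2)
            - 0.5 * (s11 - 2.0 * rho * s12 + s22) / (1.0 - rho**2))


def update_rho(n, s11, s22, s12, tol=1e-8):
    """Solve n r^3 - s12 r^2 + (s11 + s22 - n) r - s12 = 0 for r in (-1, 1)."""
    roots = np.roots([n, -s12, s11 + s22 - n, -s12])
    real = roots[np.abs(roots.imag) < tol].real
    inside = real[np.abs(real) < 1.0]
    if inside.size == 0:
        raise ValueError("no stationary point strictly inside (-1, 1)")
    # With a single real root this is the maximizer; with several admissible
    # roots, fall back to comparing objective values (an assumed tie-break).
    return float(max(inside, key=lambda r: q_rho(r, n, s11, s22, s12)))


# Hypothetical sufficient statistics: the cubic factors as (2r - 1)(2r^2 + 2).
rho_hat = update_rho(n=4, s11=4.0, s22=4.0, s12=2.0)
print(rho_hat)
```

For these example values the cubic has a single real root, and the objective `q_rho` is lower at nearby points on either side of it, in line with the sign argument above.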

About this article

Cite this article

Yu, P.L.H., Lee, P.H. & Wan, W.M. Factor analysis for paired ranked data with application on parent–child value orientation preference data. Comput Stat 28, 1915–1945 (2013). https://doi.org/10.1007/s00180-012-0387-0

Received: 09 April 2012
Accepted: 21 November 2012
Published: 15 December 2012
Issue Date: October 2013
DOI: https://doi.org/10.1007/s00180-012-0387-0
