Abstract
In this study, a recently proposed data-fitting factor analysis (DFFA) procedure is generalized to the analysis of categorical variables. For generalized DFFA (GDFFA), we develop an alternating least squares algorithm consisting of a multiple quantification step and a model parameter estimation step. The differences between GDFFA and similar statistical methods such as multiple correspondence analysis and FACTALS are also discussed. The developed algorithm and its solution are illustrated with a real data example.
References
Adachi K (2012) Some contributions to data-fitting factor analysis with empirical comparisons to covariance-fitting factor analysis. J Jpn Soc Comput Stat 25:25–38
Adachi K, Murakami T (2011) Nonmetric multivariate analysis: MCA, NPCA, and PCA. Asakura Shoten, Tokyo (in Japanese)
Anderson TW, Rubin H (1956) Statistical inference in factor analysis. In: Neyman J (ed) Proceedings of the third Berkeley symposium on mathematical statistics and probability, vol 5. University of California Press, Berkeley, pp 111–150
Benzecri JP (1974) L’analyse des données. Tome 1: La taxinomie. Tome 2: L’analyse des correspondances. Dunod, Paris
Benzecri JP (1992) Correspondence analysis handbook. Marcel Dekker, New York
Browne MW (2001) An overview of analytic rotation in exploratory factor analysis. Multivar Behav Res 36:111–150
de Leeuw J (2004) Least squares optimal scaling of partially observed linear systems. In: van Montfort K, Oud J, Satorra A (eds) Recent developments on structural equation models: theory and applications. Kluwer, Dordrecht, pp 121–134
de Leeuw J (2008) Factor analysis as matrix decomposition. Preprint series: Department of Statistics, University of California, Los Angeles
Greenacre MJ (1984) Theory and application of correspondence analysis. Academic Press, London
Gifi A (1990) Nonlinear multivariate analysis. Wiley, Chichester
Harman HH (1976) Modern factor analysis, 3rd edn. University of Chicago Press, Chicago
Kuroda M, Mori Y, Iizuka M, Sakakihara M (2012) Acceleration of convergence of the alternating least squares algorithm for nonlinear principal components analysis. In: Sanguansat P (ed) Principal component analysis. InTech, Winchester
Mulaik SA (2010) Foundations of factor analysis, 2nd edn. Chapman and Hall/CRC, Boca Raton
Murakami T (1999) A psychometrics study on principal component analysis of categorical data. Technical report (in Japanese)
Murakami T (2001) Investigation of justifiability of Likert scaling using nonmetric principal components analysis and multiple correspondence analysis. Technical report (in Japanese)
Murakami T, Kiers HAL, ten Berge JMF (1999) Non-metric principal component analysis for categorical variables with multiple quantifications (unpublished manuscript)
Nishisato S (2006) Multidimensional nonlinear descriptive analysis. Chapman and Hall/CRC, Boca Raton
Schmitt TA, Sass DA (2011) Rotation criteria and hypothesis testing for exploratory factor analysis: implications for factor pattern loadings and interfactor correlations. Educ Psychol Measur 71:95–113
Schneeweiss H, Mathes H (1995) Factor analysis and principal components. J Multivar Anal 55:105–124
Takane Y, Young FW, de Leeuw J (1979) Nonmetric common factor analysis: an alternating least squares method with optimal scaling features. Behaviormetrika 6:45–56
ten Berge JMF (1993) Least squares optimization in multivariate analysis. DSWO Press, Leiden
Tenenhaus M, Young FW (1985) An analysis and synthesis of multiple correspondence analysis, optimal scaling, dual scaling, homogeneity analysis and other methods for quantifying categorical multivariate data. Psychometrika 50:91–119
Trendafilov NT, Unkel S, Krzanowski W (2013) Exploratory factor and principal component analyses: some new aspects. Stat Comput 23:209–220
Unkel S, Trendafilov NT (2010) Simultaneous parameter estimation in exploratory factor analysis: an expository review. Int Stat Rev 78:363–382
Unkel S, Trendafilov NT (2011) Zig-zag exploratory factor analysis with more variables than observations. Comput Stat 28:107–125
van der Burg E, de Leeuw J, Verdegaal R (1988) Homogeneity analysis with k sets of variables: an alternating least squares method with optimal scaling features. Psychometrika 53:177–197
Yanai H, Ichikawa M (2007) Factor analysis. In: Rao CR, Sinharay S (eds) Handbook of statistics, vol 26: Psychometrics. Elsevier, Amsterdam, pp 257–296
Young FW (1981) Quantitative analysis of qualitative data. Psychometrika 46:357–388
Acknowledgments
The author would like to thank Prof. Kohei Adachi for his very helpful comments on previous versions of this paper.
Appendix
Japanese baseball data in 2010
The first six variables are the same as those reported in Adachi (2012), while the additional nominal variable was obtained from http://www.baseball-data.com/10/lineup/. The first six variables are as follows: [1] batting average (BA), defined as the ratio of hits to at-bats multiplied by one thousand; [2] runs (R), the number of times a batter scored; [3] doubles (D), the number of two-base hits; [4] home runs (HR), the number of home runs hit; [5] runs batted in (RBI), the number of runs for which the batter was responsible; and [6] strikeouts (SO), the number of times the batter struck out. Each of these six variables is dichotomized into lower (1) and higher (2). For batting order (BO), batters are categorized into the following three groups: those hitting in the first two slots (1), those hitting in the middle three slots (2), and those hitting in the bottom four slots (3) (Table 8).
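The coding scheme described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the sample values are made up rather than taken from the 2010 data, and the median cut point used for dichotomization is an assumption, since the appendix does not state the threshold that separates "lower" from "higher".

```python
from statistics import median


def batting_average(hits, at_bats):
    """BA as defined in the appendix: hits per at-bat, times one thousand."""
    return 1000 * hits / at_bats


def dichotomize(values, cut=None):
    """Code each value as 1 (lower) or 2 (higher).

    Assumption: when no cut point is given, the sample median is used and
    values strictly above it are coded as 'higher'. The appendix does not
    specify the actual threshold.
    """
    if cut is None:
        cut = median(values)
    return [2 if v > cut else 1 for v in values]


def batting_order_group(slot):
    """Map a lineup slot (1-9) to the three BO categories of the appendix."""
    if not 1 <= slot <= 9:
        raise ValueError("lineup slot must be between 1 and 9")
    if slot <= 2:
        return 1  # first two slots
    if slot <= 5:
        return 2  # middle three slots
    return 3      # bottom four slots


# Made-up values for four hypothetical batters (not the real data).
home_runs = [3, 49, 12, 27]
slots = [1, 4, 9, 2]

hr_coded = dichotomize(home_runs)
bo_coded = [batting_order_group(s) for s in slots]
```

With a different cut point (for example a fixed league-wide value), the same `dichotomize` call can be reused by passing `cut` explicitly.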
Cite this article
Makino, N. Generalized data-fitting factor analysis with multiple quantification of categorical variables. Comput Stat 30, 279–292 (2015). https://doi.org/10.1007/s00180-014-0536-8
DOI: https://doi.org/10.1007/s00180-014-0536-8