Generalized data-fitting factor analysis with multiple quantification of categorical variables

  • Original Paper
Computational Statistics

Abstract

In this study, a recently proposed data-fitting factor analysis (DFFA) procedure is generalized to the analysis of categorical variables. For generalized DFFA (GDFFA), we develop an alternating least squares algorithm consisting of a multiple quantification step and a model parameter estimation step. The differences between GDFFA and similar statistical methods, such as multiple correspondence analysis and FACTALS, are also discussed. The developed algorithm and its solution are illustrated with a real data example.

References

  • Adachi K (2012) Some contributions to data-fitting factor analysis with empirical comparisons to covariance-fitting factor analysis. J Jpn Soc Comput Stat 25:25–38

  • Adachi K, Murakami T (2011) Nonmetric multivariate analysis: MCA, NPCA, and PCA. Asakura Shoten, Tokyo (in Japanese)

  • Anderson TW, Rubin H (1956) Statistical inference in factor analysis. In: Neyman J (ed) Proceedings of the third Berkeley symposium on mathematical statistics and probability, vol 5. University of California Press, Berkeley, pp 111–150

  • Benzecri JP (1974) L’analyse des données: Tome 1. La taxinomie; Tome 2. L’analyse des correspondances. Dunod, Paris

  • Benzecri JP (1992) Correspondence analysis handbook. Marcel Dekker, New York

  • Browne MW (2001) An overview of analytic rotation in exploratory factor analysis. Multivar Behav Res 36:111–150

  • de Leeuw J (2004) Least squares optimal scaling of partially observed linear systems. In: van Montfort K, Oud J, Satorra A (eds) Recent developments on structural equation models: theory and applications. Kluwer, Dordrecht, pp 121–134

  • de Leeuw J (2008) Factor analysis as matrix decomposition. Preprint series: Department of Statistics, University of California, Los Angeles

  • Greenacre MJ (1984) Theory and application of correspondence analysis. Academic Press, London

  • Gifi A (1990) Nonlinear multivariate analysis. Wiley, Chichester

  • Harman HH (1976) Modern factor analysis, 3rd edn. University of Chicago Press, Chicago

  • Kuroda M, Mori Y, Iizuka M, Sakakihara M (2012) Acceleration of convergence of the alternating least squares algorithm for nonlinear principal components analysis. In: Sanguansat P (ed) Principal component analysis. InTech, Winchester

  • Mulaik SA (2010) Foundations of factor analysis, 2nd edn. Chapman and Hall/CRC, Boca Raton

  • Murakami T (1999) A psychometrics study on principal component analysis of categorical data. Technical report (in Japanese)

  • Murakami T (2001) Investigation of justifiability of Likert scaling using nonmetric principal components analysis and multiple correspondence analysis. Technical report (in Japanese)

  • Murakami T, Kiers HAL, ten Berge JMF (1999) Non-metric principal component analysis for categorical variables with multiple quantifications (unpublished manuscript)

  • Nishisato S (2006) Multidimensional nonlinear descriptive analysis. Chapman and Hall/CRC, Boca Raton

  • Schmitt TA, Sass DA (2011) Rotation criteria and hypothesis testing for exploratory factor analysis: implications for factor pattern loadings and interfactor correlations. Educ Psychol Measur 71:95–113

  • Schneeweiss H, Mathes H (1995) Factor analysis and principal components. J Multivar Anal 55:105–124

  • Takane Y, Young FW, de Leeuw J (1979) Nonmetric common factor analysis: an alternating least squares method with optimal scaling features. Behaviormetrika 6:45–56

  • ten Berge JMF (1993) Least squares optimization in multivariate analysis. DSWO Press, Leiden

  • Tenenhaus M, Young FW (1985) An analysis and synthesis of multiple correspondence analysis, optimal scaling, dual scaling, homogeneity analysis and other methods for quantifying categorical multivariate data. Psychometrika 50:91–119

  • Trendafilov NT, Unkel S, Krzanowski W (2013) Exploratory factor and principal component analyses: some new aspects. Stat Comput 23:209–220

  • Unkel S, Trendafilov NT (2010) Simultaneous parameter estimation in exploratory factor analysis: an expository review. Int Stat Rev 78:363–382

  • Unkel S, Trendafilov NT (2011) Zig-zag exploratory factor analysis with more variables than observations. Comput Stat 28:107–125

  • van der Burg E, de Leeuw J, Verdegaal R (1988) Homogeneity analysis with k sets of variables: an alternating least squares method with optimal scaling features. Psychometrika 53:177–197

  • Yanai H, Ichikawa M (2007) Factor analysis. In: Rao CR, Sinharay S (eds) Handbook of statistics, vol 26: Psychometrics. Elsevier, Amsterdam, pp 257–296

  • Young FW (1981) Quantitative analysis of qualitative data. Psychometrika 46:357–388

Acknowledgments

The author would like to thank Prof. Kohei Adachi for his very helpful comments on previous versions of this paper.

Author information

Corresponding author

Correspondence to Naomichi Makino.

Appendix

Japanese baseball data in 2010

The first six variables are the same as those reported in Adachi (2012), while the additional nominal variable was obtained from http://www.baseball-data.com/10/lineup/. The first six variables are as follows: [1] batting average (BA), defined as the ratio of hits to at-bats multiplied by one thousand; [2] runs (R), the number of times a batter scored; [3] doubles (D), the number of two-base hits; [4] home runs (HR), the number of home runs hit; [5] runs batted in (RBI), the number of runs for which the batter was responsible; and [6] strikeouts (SO), the number of times the batter struck out. Each of these six variables is dichotomized into a lower (1) and a higher (2) category. The additional variable, batting order (BO), is coded in three categories: batters hitting in the first two slots (1), in the middle three slots (2), and in the bottom four slots (3) (Table 8).

Table 8 Six scores achieved by 62 batters and their batting order in Japanese professional baseball, 2010
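
The coding of the batting average and of the batting order described in this appendix can be sketched as follows. This is a minimal illustration and not the author's code: the function names and the five example lineup slots are hypothetical, and the 0/1 indicator coding of the categories is an assumption based on how categorical variables are commonly represented in multiple correspondence analysis and related methods. The cut-off used to dichotomize the six score variables is not stated here, so that step is not shown.

```python
def batting_average(hits, at_bats):
    """BA as defined in the appendix: hits / at-bats, multiplied by one thousand."""
    return 1000.0 * hits / at_bats


def batting_order_group(slot):
    """Map a lineup slot (1-9) to the three BO categories of the appendix."""
    if not 1 <= slot <= 9:
        raise ValueError("lineup slot must be between 1 and 9")
    if slot <= 2:
        return 1  # first two slots
    if slot <= 5:
        return 2  # middle three slots
    return 3      # bottom four slots


def indicator_matrix(categories, n_categories):
    """Return one 0/1 row per observation, with a single 1 in the column
    of the observed category (categories are coded 1..n_categories)."""
    rows = []
    for c in categories:
        if not 1 <= c <= n_categories:
            raise ValueError("category code out of range")
        rows.append([1 if k == c else 0 for k in range(1, n_categories + 1)])
    return rows


# Hypothetical lineup slots for five batters (illustrative, not taken from Table 8).
slots = [1, 4, 9, 2, 6]
bo = [batting_order_group(s) for s in slots]
G_bo = indicator_matrix(bo, 3)
```

Each row of `G_bo` contains exactly one 1, so the matrix records category membership without imposing any numerical spacing between the three batting-order groups.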

Cite this article

Makino, N. Generalized data-fitting factor analysis with multiple quantification of categorical variables. Comput Stat 30, 279–292 (2015). https://doi.org/10.1007/s00180-014-0536-8

  • DOI: https://doi.org/10.1007/s00180-014-0536-8
