
Reduced \(k\)-means clustering with MCA in a low-dimensional space

  • Original Paper
Computational Statistics

Abstract

In the two-step sequential approach called tandem analysis, a clustering algorithm is applied to object scores estimated after dimension reduction of the variables. In this approach, the reduction step may obscure or mask taxonomic information (Arabie and Hubert in Handbook of marketing research. Blackwell, Oxford, 1994). As an alternative to tandem analysis, Hwang et al. (Psychometrika 71:161–171, 2006) proposed an approach that combines two methods for categorical data; however, their method does not remove the object scores estimated as a vector of \(1\), which carry no meaning, from the first dimension. In this study, we propose a method for clustering objects described by categorical variables in a low-dimensional space. The proposed method performs multi-dimensional nonmetric principal component analysis and \(k\)-means clustering for categorical data simultaneously; that is, it reduces the dimensionality through category quantifications and clusters the resulting object scores at the same time. Because object scores and variable categories are displayed together, the relationships between objects and categories can be interpreted for each cluster. We compare the method with tandem clustering on simulated data and apply it to real-world data.
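
For orientation, the tandem baseline mentioned in the abstract can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' proposed simultaneous method: the dimension reduction is taken to be a plain MCA computed from the SVD of the scaled indicator matrix, the clustering is a basic Lloyd-style \(k\)-means, and the function names (`indicator_matrix`, `mca_object_scores`, `kmeans`) are hypothetical. The sketch also shows the trivial first dimension, whose object scores are a constant vector of \(1\) (up to sign) and which is dropped before clustering.

```python
import numpy as np

def indicator_matrix(data):
    """One-hot encode an (n, m) integer-coded categorical array column by column."""
    blocks = []
    for j in range(data.shape[1]):
        cats = np.unique(data[:, j])
        blocks.append((data[:, j][:, None] == cats[None, :]).astype(float))
    return np.hstack(blocks)

def mca_object_scores(G, n_dims, drop_trivial=True):
    """Object scores (standard row coordinates) from the SVD of the scaled indicator matrix."""
    P = G / G.sum()
    r = P.sum(axis=1)                      # row masses
    c = P.sum(axis=0)                      # column masses
    S = P / np.sqrt(np.outer(r, c))        # D_r^{-1/2} P D_c^{-1/2}, not centered
    U, sing, _ = np.linalg.svd(S, full_matrices=False)
    scores = U / np.sqrt(r)[:, None]
    start = 1 if drop_trivial else 0       # skip the constant first dimension
    return scores[:, start:start + n_dims], sing

def kmeans(X, k, n_iter=100, seed=0):
    """Basic Lloyd iterations; an empty cluster keeps its previous center."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            X[labels == g].mean(axis=0) if np.any(labels == g) else centers[g]
            for g in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Toy data (made up for illustration): 40 objects, 4 variables, 3 categories each.
rng = np.random.default_rng(1)
data = rng.integers(0, 3, size=(40, 4))
G = indicator_matrix(data)

# Step 1: reduce to two dimensions. Step 2: cluster the object scores.
scores, _ = mca_object_scores(G, n_dims=2)
labels, centers = kmeans(scores, k=3)
```

Because the two steps optimize different criteria, the dimensions retained in step 1 need not be the ones that separate the clusters, which is the masking problem the abstract attributes to tandem analysis.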

References

  • Adachi K, Murakami T (2011) Nonmetric multivariate analysis. Asakura-Shoten, Tokyo (in Japanese)

  • Arabie P, Hubert L (1994) Cluster analysis in marketing research. In: Bagozzi RP (ed) Handbook of marketing research. Blackwell, Oxford

  • De Soete G, Carroll JD (1994) K-means clustering in a low-dimensional Euclidean space. In: Diday E, Lechevallier Y, Schader M, Bertrand P, Burtschy B (eds) New approaches in classification and data analysis. Springer, Heidelberg, pp 212–219

  • Gifi A (1990) Nonlinear multivariate analysis. Wiley, Chichester

  • Hubert L, Arabie P (1985) Comparing partitions. J Classif 2:193–218

  • Hwang H, Dillon WR (2010) Simultaneous two-way clustering of multiple correspondence analysis. Multivar Behav Res 45:186–208

  • Hwang H, Dillon WR, Takane Y (2006) An extension of multiple correspondence analysis for identifying heterogeneous subgroups of respondents. Psychometrika 71:161–171

  • Hwang H, Dillon WR, Takane Y (2010) Fuzzy cluster multiple correspondence analysis. Behaviormetrika 37:111–133

  • Iodice D’Enza A, Palumbo F (2013) Iterative factor clustering of binary data. Comput Stat 28:1–19

  • Iodice D’Enza A, Van de Velden M, Palumbo F (2014) On joint dimension reduction and clustering of categorical data. In: Vicari D, Okada A, Ragozini G, Weihs C (eds) Analysis and modeling of complex data in behavioral and social sciences. Springer, Switzerland, pp 161–169

  • Lincoff GH (1981) The Audubon Society field guide to North American mushrooms. Alfred A. Knopf, New York

  • MacQueen J (1967) Some methods for classification and analysis of multivariate observations. Proc Fifth Berkeley Symp Math Stat Probab 1:281–297

  • Rocci R, Garrone SA, Vichi M (2011) A new dimension reduction method: factor discriminant \(k\)-means. J Classif 28:210–226

  • ten Berge JM (1993) Least squares optimization in multivariate analysis. DSWO Press, Leiden University, Leiden

  • Timmerman ME, Ceulemans E, Kiers HAL, Vichi M (2010) Factorial and reduced \(K\)-means reconsidered. Comput Stat Data Anal 54:1858–1871

  • Van Buuren S, Heiser WJ (1989) Clustering \(N\) objects into \(K\) groups under optimal scaling of variables. Psychometrika 54:699–706

  • Van de Velden M, Iodice D’Enza A, Palumbo F (2012) On joint dimension reduction and clustering. In: JCS-CLADAG, analysis and modeling of complex data in behavioural and social sciences, September 3–4, 2012, Capri, Italy

  • Vichi M, Kiers HAL (2001) Factorial \(k\)-means analysis for two-way data. Comput Stat Data Anal 37:49–64

Author information

Corresponding author

Correspondence to Masaki Mitsuhiro.

About this article

Cite this article

Mitsuhiro, M., Yadohisa, H. Reduced \(k\)-means clustering with MCA in a low-dimensional space. Comput Stat 30, 463–475 (2015). https://doi.org/10.1007/s00180-014-0544-8
