A New Dimension Reduction Method: Factor Discriminant K-means

Rocci, Roberto; Gattone, Stefano Antonio; Vichi, Maurizio

doi:10.1007/s00357-011-9085-9

Published: 05 July 2011

Journal of Classification, Volume 28, pages 210–226 (2011)

Roberto Rocci¹,
Stefano Antonio Gattone¹ &
Maurizio Vichi²

Abstract

Reduced K-means (RKM) and Factorial K-means (FKM) are two data reduction techniques that incorporate principal component analysis and K-means into a unified methodology to obtain a reduced set of components for variables and an optimal partition for objects. RKM finds clusters in a reduced space by maximizing the between-clusters deviance without imposing any condition on the within-clusters deviance, so that clusters are isolated but might be heterogeneous. FKM, on the other hand, identifies clusters in a reduced space by minimizing the within-clusters deviance without imposing any condition on the between-clusters deviance, so that clusters are homogeneous but might not be isolated. The two techniques give different results because the total deviance in the reduced space is not constant for either methodology; hence minimizing the within-clusters deviance is not equivalent to maximizing the between-clusters deviance. In this paper a modification of the two techniques is introduced to avoid the aforementioned weaknesses. It is shown that the two modified methods give the same results, thus merging RKM and FKM into a new methodology. It is called Factor Discriminant K-means (FDKM), because it combines Linear Discriminant Analysis and K-means. The paper examines several theoretical properties of FDKM and evaluates its performance in a simulation study. An application to real-world data is presented to show the features of FDKM.
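The point about the non-constant total deviance can be illustrated numerically. The following is a minimal sketch, not the authors' algorithm: it assumes column-centred data, a fixed and arbitrary partition, and two hypothetical orthonormal loading matrices `A1` and `A2` chosen only for illustration. For each projection it checks that total deviance equals within plus between deviance, and that the total itself changes with the projection, which is why minimizing one term and maximizing the other need not agree.

```python
import numpy as np

def deviances(X, labels, A):
    """Return (total, within, between) deviance of the scores X @ A.

    X is centred by column first, so the overall centroid of the
    projected scores is the origin.
    """
    Xc = X - X.mean(axis=0)
    Y = Xc @ A                      # scores in the reduced space
    total = float(np.sum(Y ** 2))
    within = 0.0
    between = 0.0
    for k in np.unique(labels):
        Yk = Y[labels == k]
        centroid = Yk.mean(axis=0)
        within += float(np.sum((Yk - centroid) ** 2))
        between += len(Yk) * float(np.sum(centroid ** 2))
    return total, within, between

# Illustrative data (an assumption): four variables with unequal spread.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4)) * np.array([3.0, 1.0, 1.0, 0.5])
labels = np.repeat([0, 1, 2], 20)   # arbitrary fixed partition, not a fitted one

# Two hypothetical orthonormal projections onto two dimensions.
A1 = np.eye(4)[:, :2]               # keeps variables 1 and 2
A2 = np.eye(4)[:, 2:]               # keeps variables 3 and 4

t1, w1, b1 = deviances(X, labels, A1)
t2, w2, b2 = deviances(X, labels, A2)

print(f"A1: total={t1:.1f} within={w1:.1f} between={b1:.1f}")
print(f"A2: total={t2:.1f} within={w2:.1f} between={b2:.1f}")
```

Because the total differs between `A1` and `A2`, a projection can lower the within-clusters deviance simply by shrinking the total, without making the clusters any better separated.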

Author information

Authors and Affiliations

Dept. SEFeMeQ, University of “Tor Vergata”, Tor Vergata, Rome
Roberto Rocci & Stefano Antonio Gattone
Dept. Statistics, Probability and Applied Statistics, University “La Sapienza”, La Sapienza, Rome
Maurizio Vichi

Corresponding author

Correspondence to Roberto Rocci.

About this article

Cite this article

Rocci, R., Gattone, S.A. & Vichi, M. A New Dimension Reduction Method: Factor Discriminant K-means. J Classif 28, 210–226 (2011). https://doi.org/10.1007/s00357-011-9085-9

Issue Date: July 2011