Block-Relaxation Approaches for Fitting the INDCLUS Model

Journal of Classification

Abstract

The INDCLUS model is a well-known clustering model for I × I × J data blocks whose J frontal slices are I × I object-by-object similarity matrices. The model implies a grouping of the I objects into a prespecified number of overlapping clusters, with each cluster having a slice-specific positive weight. An INDCLUS model is fitted to a given data set by minimizing a least squares loss function. Minimizing this loss function has proven to be a difficult problem, for which several algorithmic strategies have been proposed. At present, the best available option seems to be the SYMPRES algorithm, which minimizes the loss function by means of a block-relaxation algorithm. Yet SYMPRES is conjectured to suffer from a severe local optima problem. As a way out, five alternative block-relaxation algorithms are proposed, based on theoretical results on the optimal design of block-relaxation algorithms. In a simulation study, the alternative algorithms with overlapping parameter subsets perform best and clearly outperform SYMPRES in terms of optimization performance and cluster recovery.
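
To make the least squares criterion concrete, the following is a minimal sketch, not the authors' implementation. It assumes the usual INDCLUS decomposition in which slice j is approximated by A diag(w_j) A', with A a binary I × R cluster membership matrix and w_j the R cluster weights of slice j. The function names, the omission of any additive constant, and the choice to sum over all entries (diagonal included) are assumptions made for illustration only.

```python
def indclus_reconstruct(A, W):
    """Model-implied similarities M[j][i][k] = sum_r A[i][r] * W[j][r] * A[k][r].

    A: I x R binary membership matrix (list of lists); clusters may overlap.
    W: J x R cluster weights, one row per slice (assumed positive).
    """
    n_clusters = len(A[0])
    return [
        [
            [sum(a_i[r] * w_j[r] * a_k[r] for r in range(n_clusters)) for a_k in A]
            for a_i in A
        ]
        for w_j in W
    ]


def indclus_loss(S, A, W):
    """Sum of squared differences between the data S (J x I x I) and the model."""
    M = indclus_reconstruct(A, W)
    return sum(
        (s - m) ** 2
        for S_j, M_j in zip(S, M)
        for s_row, m_row in zip(S_j, M_j)
        for s, m in zip(s_row, m_row)
    )


# Hypothetical toy example: 3 objects, 2 overlapping clusters, 2 slices.
A = [[1, 0], [1, 1], [0, 1]]      # object 2 belongs to both clusters
W = [[2.0, 1.0], [0.5, 3.0]]      # slice-specific cluster weights
S = indclus_reconstruct(A, W)     # noise-free data generated by the model
print(indclus_loss(S, A, W))      # 0.0, since the data fit perfectly
```

A block-relaxation algorithm would reduce this loss by cycling over subsets of the parameters (for example the memberships and the weights), updating one subset while the others are held fixed. How those subsets are chosen is precisely what distinguishes the algorithms compared in the article.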

Author information

Corresponding author

Correspondence to Tom F. Wilderjans.

Additional information

The research in this paper was partially supported by the Research Fund of KU Leuven (PDM-kort project 3 H100377, dr. Tom F. Wilderjans; GOA 2005/04, Prof. dr. Iven Van Mechelen), by the Belgian Science Policy (IAP P6/03, Prof. dr. Iven Van Mechelen), and by the Fund of Scientific Research (FWO)-Flanders (project G.0546.09, Prof. dr. I. Van Mechelen). The simulation study was conducted using high performance computational resources provided by KU Leuven (http://ludit.kuleuven.be/hpc). We would like to thank Prof. dr. J. de Leeuw for his helpful advice on the topic of block-relaxation algorithms. We would also like to thank three anonymous reviewers for their useful comments and suggestions, which considerably improved earlier versions of this manuscript.

About this article

Cite this article

Wilderjans, T.F., Depril, D. & Van Mechelen, I. Block-Relaxation Approaches for Fitting the INDCLUS Model. J Classif 29, 277–296 (2012). https://doi.org/10.1007/s00357-012-9113-4

  • DOI: https://doi.org/10.1007/s00357-012-9113-4
