Abstract
Many research questions involve a regression problem in which the population under study is not homogeneous with respect to the underlying model. For this setting, we propose an original method called Combined Information criterion CLUSterwise elastic-net regression (Ciclus), which addresses several methodological and application-related challenges. It is derived from both information theory and microeconomic utility theory, and maximizes a well-defined criterion combining three weighted sub-criteria, each related to a specific aim: obtaining a parsimonious partition, obtaining compact clusters for better prediction of cluster membership, and achieving a good within-cluster regression fit. Under mild assumptions, the solving algorithm converges monotonically. The Ciclus principle provides an innovative solution to two key issues: (i) automatic optimization of the number of clusters, and (ii) the proposal of a prediction model. We applied it to elastic-net regression in order to handle high-dimensional data involving redundant explanatory variables. Ciclus is illustrated through both a simulation study and a real example on omics data, showing how it improves prediction quality and facilitates interpretation. It should therefore prove useful whenever the data involve a population mixture, as for example in biology, social sciences, economics, or marketing.
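The within-cluster regression fit mentioned above can be illustrated by a generic clusterwise elastic-net objective. This is a sketch under stated assumptions, not the Ciclus criterion itself. The notation is introduced here for illustration only: $K$ clusters forming a partition $C_1, \dots, C_K$ of the observations, cluster-specific coefficient vectors $\beta_k$, and tuning parameters $\lambda \ge 0$ and $\alpha \in [0, 1]$ in the usual elastic-net parametrization, up to scaling of the loss. The two other sub-criteria (parsimony of the partition and compactness of the clusters) and their weights are not shown, since the abstract does not give their form.

```latex
\min_{\{C_k\},\,\{\beta_k\}} \;
\sum_{k=1}^{K} \left[
  \frac{1}{2} \sum_{i \in C_k} \left( y_i - x_i^{\top} \beta_k \right)^2
  + \lambda \left( \alpha \,\lVert \beta_k \rVert_1
  + \frac{1 - \alpha}{2} \,\lVert \beta_k \rVert_2^2 \right)
\right]
```

In this parametrization, $\alpha = 1$ gives the lasso penalty and $\alpha = 0$ gives the ridge penalty. Intermediate values combine variable selection with shrinkage of correlated coefficients, which is why the elastic net suits redundant explanatory variables.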
Cite this article
Bry, X., Niang, N., Verron, T. et al. Clusterwise elastic-net regression based on a combined information criterion. Adv Data Anal Classif 17, 75–107 (2023). https://doi.org/10.1007/s11634-021-00489-w
DOI: https://doi.org/10.1007/s11634-021-00489-w
Keywords
- Clusterwise regression
- Typological regression
- Lasso regularization
- Multicollinearity
- Ridge regression
- Elastic-net regularization