Abstract
Cluster-weighted models (CWMs) are useful tools for identifying latent functional relationships between response variables and covariates. However, owing to the excessive distributional assumptions made on the covariates, these models can suffer from misspecification of the component distributions, which can undermine estimation accuracy and make the model structure difficult to interpret. To address this issue, we consider CWMs with univariate responses and propose a novel CWM that models each cluster as a finite mixture, enhancing flexibility while retaining parsimony. We prove that the proposed method can provide more meaningful clusters in the data than existing methods. Additionally, we present a procedure for constructing the proposed CWM and a feasible expectation-maximization algorithm for estimating the model parameters. Numerical demonstrations, including simulations and real data analysis, are also provided.
Data Availability
The datasets analyzed during this study are available from the fpc package (Hennig & Imports, 2015) in R at https://CRAN.R-project.org/package=fpc and from the UCI Machine Learning Repository at https://archive.ics.uci.edu/ml/datasets/abalone.
References
Bai, X., Yao, W., & Boyer, J. E. (2012). Robust fitting of mixture regression models. Computational Statistics & Data Analysis, 56(7), 2347–2359.
Biernacki, C., Celeux, G., & Govaert, G. (2000). Assessing a mixture model for clustering with the integrated completed likelihood. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(7), 719–725.
Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32.
Celeux, G., & Govaert, G. (1995). Gaussian parsimonious clustering models. Pattern Recognition, 28(5), 781–793.
Chamroukhi, F. (2016). Robust mixture of experts modeling using the t distribution. Neural Networks, 79, 20–36.
Cohen, E. (1980). Inharmonic tone perception. Unpublished PhD dissertation, Stanford University.
Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20(3), 273–297.
Dang, U. J., Punzo, A., McNicholas, P. D., Ingrassia, S., & Browne, R. P. (2017). Multivariate response and parsimony for Gaussian cluster-weighted models. Journal of Classification, 34(1), 4–34.
Day, N. E. (1969). Estimating the components of a mixture of normal distributions. Biometrika, 56(3), 463–474.
Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B (Methodological), 39(1), 1–22.
Dua, D., & Graff, C. (2017). UCI machine learning repository. http://archive.ics.uci.edu/ml.
Freund, Y., & Schapire, R. E. (1997). A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55(1), 119–139.
Friedman, J. (1999). Greedy function approximation: A gradient boosting machine. North, 1(3), 1–10.
Friedman, J., Hastie, T., & Tibshirani, R. (2000). Additive logistic regression: A statistical view of boosting (with discussion and a rejoinder by the authors). The Annals of Statistics, 28(2), 337–407.
Gallaugher, M. P., Tomarchio, S. D., McNicholas, P. D., & Punzo, A. (2022). Multivariate cluster weighted models using skewed distributions. Advances in Data Analysis and Classification, 16(1), 93–124.
Gershenfeld, N. (1997). Nonlinear inference and cluster-weighted modeling. Annals of the New York Academy of Sciences, 808(1), 18–24.
Gupta, S., & Chintagunta, P. K. (1994). On using demographic variables to determine segment membership in logit mixture models. Journal of Marketing Research, 31(1), 128–136.
Hennig, C. (2000). Identifiability of models for clusterwise linear regression. Journal of Classification, 17(2), 273–296.
Hennig, C. (2010). Methods for merging Gaussian mixture components. Advances in Data Analysis and Classification, 4(1), 3–34.
Hennig, C., & Imports, M. (2015). Package ‘fpc’. CRAN.
Hubert, L., & Arabie, P. (1985). Comparing partitions. Journal of Classification, 2(1), 193–218.
Ingrassia, S., Minotti, S. C., & Punzo, A. (2014). Model-based clustering via linear cluster-weighted models. Computational Statistics & Data Analysis, 71, 159–182.
Ingrassia, S., Minotti, S. C., & Vittadini, G. (2012). Local statistical modeling via a cluster-weighted approach with elliptical distributions. Journal of Classification, 29(3), 363–401.
Ingrassia, S., Punzo, A., Vittadini, G., & Minotti, S. C. (2015). The generalized linear mixed cluster-weighted model. Journal of Classification, 32(2), 327–355.
Jacobs, R. A., Jordan, M. I., Nowlan, S. J., & Hinton, G. E. (1991). Adaptive mixtures of local experts. Neural Computation, 3(1), 79–87.
Jordan, M. I., & Jacobs, R. A. (1994). Hierarchical mixtures of experts and the EM algorithm. Neural Computation, 6(2), 181–214.
Kamakura, W. A., Wedel, M., & Agrawal, J. (1994). Concomitant variable latent class models for conjoint analysis. International Journal of Research in Marketing, 11(5), 451–464.
Kim, D., & Seo, B. (2014). Assessment of the number of components in Gaussian mixture models in the presence of multiple local maximizers. Journal of Multivariate Analysis, 125, 100–120.
LeDell, E., Gill, N., Aiello, S., Fu, A., Candel, A., Click, C., Kraljevic, T., Nykodym, T., Aboyoun, P., Kurka, M., et al. (2018). Package ‘h2o’. CRAN.
Leisch, F., & Dimitriadou, E. (2009). Package ‘mlbench’. CRAN.
MacQueen, J. (1967). Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Oakland, CA, USA, vol. 1, pp. 281–297.
Mazza, A., Punzo, A., & Ingrassia, S. (2018). flexcwm: A flexible framework for cluster-weighted models. Journal of Statistical Software, 86(2), 1–30.
McLachlan, G. J., & Peel, D. (2004). Finite mixture models. John Wiley & Sons.
McNicholas, P. D. (2016). Model-based clustering. Journal of Classification, 33(3), 331–373.
Meyer, D., Dimitriadou, E., Hornik, K., Weingessel, A., Leisch, F., Chang, C. C., Lin, C. C., & Meyer, M. D. (2019). Package ‘e1071’. CRAN.
Murphy, K., & Murphy, T. B. (2022). Package ‘MoEClust’. CRAN.
Murphy, K., & Murphy, T. B. (2020). Gaussian parsimonious clustering models with covariates and a noise component. Advances in Data Analysis and Classification, 14(2), 293–325.
Nash, W. J., Sellers, T. L., Talbot, S. R., Cawthorn, A. J., & Ford, W. B. (1994). The population biology of abalone (Haliotis species) in Tasmania. I. Blacklip abalone (H. rubra) from the north coast and islands of Bass Strait. Sea Fisheries Division. Technical Report, 48, 411.
Punzo, A., & McNicholas, P. D. (2017). Robust clustering in regression analysis via the contaminated Gaussian cluster-weighted model. Journal of Classification, 34(2), 249–293.
Quandt, R. E. (1972). A new approach to estimating switching regressions. Journal of the American Statistical Association, 67(338), 306–310.
Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6(2), 461–464.
Seo, B., & Kim, D. (2012). Root selection in normal mixture models. Computational Statistics & Data Analysis, 56(8), 2454–2470.
Song, W., Yao, W., & Xing, Y. (2014). Robust mixture regression model fitting by Laplace distribution. Computational Statistics & Data Analysis, 71, 128–137.
Vinh, N. X., Epps, J., & Bailey, J. (2010). Information theoretic measures for clusterings comparison: Variants, properties, normalization and correction for chance. The Journal of Machine Learning Research, 11, 2837–2854.
Wolfe, J. H. (1963). Object cluster analysis of social areas. PhD thesis, University of California.
Xu, L., Jordan, M., & Hinton, G. E. (1994). An alternative model for mixtures of experts. Advances in Neural Information Processing Systems, 7.
Yao, W., Wei, Y., & Yu, C. (2014). Robust mixture regression using the t-distribution. Computational Statistics & Data Analysis, 71, 116–127.
Young, D., Benaglia, T., Chauveau, D., Hunter, D., Elmore, R., Hettmansperger, T., Thomas, H., Xuan, F., & Young, M. D. (2020). Package ‘mixtools’. CRAN.
Zarei, S., Mohammadpour, A., Ingrassia, S., & Punzo, A. (2019). On the use of the sub-Gaussian α-stable distribution in the cluster-weighted model. Iranian Journal of Science and Technology, Transactions A: Science, 43(3), 1059–1069.
Zhang, B. (2003). Regression clustering. In Third IEEE International Conference on Data Mining, IEEE, pp. 451–458.
Funding
The research of Byungtae Seo is supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (NRF-2022R1A2C1006462).
Ethics declarations
Conflict of Interest
The authors declare no competing interests.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Oh, S., Seo, B. Merging Components in Linear Gaussian Cluster-Weighted Models. J Classif 40, 25–51 (2023). https://doi.org/10.1007/s00357-022-09424-w
DOI: https://doi.org/10.1007/s00357-022-09424-w