Abstract
Change point estimation in a standard process observed over time is an important problem in the literature, with applications in various fields. We study this problem in a heterogeneous population. A model-based clustering procedure relying on a skewed matrix-variate mixture is proposed. It is capable of capturing the heterogeneity pattern and estimating change points from all data groups simultaneously. The appeal of such an approach also lies in its flexibility to model skewness and dependence in the data while remaining interpretable. Two novel algorithms are developed: the matrix power mixture with abrupt change model and the matrix power mixture with gradual change model. The approaches are illustrated by simulation studies across a variety of settings. The models are then applied to US crime data, with promising results.






Change history
16 September 2021
An Erratum to this paper has been published: https://doi.org/10.1007/s00357-021-09400-w
Cite this article
Zhu, X., Melnykov, Y. On Finite Mixture Modeling of Change-point Processes. J Classif 39, 3–22 (2022). https://doi.org/10.1007/s00357-021-09385-6
DOI: https://doi.org/10.1007/s00357-021-09385-6