Abstract
The paper provides a practical guide to initializing recursive mixture-based clustering of non-negative data. Such data can be modeled by mixtures of uniform, exponential, gamma, and other distributions. Initialization is known to be an important step when starting a mixture estimation algorithm. Within the recursive approach considered here, the key point of initialization is the choice of initial statistics of the prior distributions involved. The paper describes several initialization techniques for these component types, intended primarily for practical use.
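To illustrate what "initial statistics of the prior" can mean in a recursive setting, the following minimal sketch assumes a single exponential component whose rate has a conjugate gamma prior. The class and method names (`ExponentialComponent`, `from_guess`, `update`, `mean_estimate`) and the pseudo-count parametrization are assumptions made for this example only; they are not taken from the paper and do not reproduce its techniques.

```python
from dataclasses import dataclass


@dataclass
class ExponentialComponent:
    """Gamma(alpha, beta) statistics for the rate of one exponential component.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    alpha: float  # behaves like a count of (pseudo-)observations
    beta: float   # behaves like a sum of (pseudo-)observed values

    @classmethod
    def from_guess(cls, mean_guess: float, pseudo_count: float = 1.0):
        """Build initial statistics from a rough guess of the component mean.

        pseudo_count says how many observations the guess is worth
        (an assumption of this sketch).
        """
        if mean_guess <= 0 or pseudo_count <= 0:
            raise ValueError("mean_guess and pseudo_count must be positive")
        return cls(alpha=pseudo_count, beta=pseudo_count * mean_guess)

    def update(self, x: float, weight: float = 1.0) -> None:
        """Recursive conjugate update with one non-negative data item.

        weight stands for the probability that x belongs to this component.
        """
        if x < 0:
            raise ValueError("data must be non-negative")
        self.alpha += weight
        self.beta += weight * x

    def mean_estimate(self) -> float:
        """Point estimate of the mean: reciprocal of the posterior-mean rate."""
        return self.beta / self.alpha


# Usage: a guess of mean 2.0 worth four observations, then one new data item.
comp = ExponentialComponent.from_guess(mean_guess=2.0, pseudo_count=4.0)
comp.update(5.0)
print(comp.alpha, comp.beta, comp.mean_estimate())
```

In this sketch a larger `pseudo_count` makes the initial guess harder for incoming data to override, which is one way to see why the choice of initial statistics matters for a recursive algorithm.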
Acknowledgements
The paper was supported by project GAČR GA15-03564S.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Suzdaleva, E., Nagy, I. (2020). Practical Initialization of Recursive Mixture-Based Clustering for Non-negative Data. In: Gusikhin, O., Madani, K. (eds) Informatics in Control, Automation and Robotics. ICINCO 2017. Lecture Notes in Electrical Engineering, vol 495. Springer, Cham. https://doi.org/10.1007/978-3-030-11292-9_34
DOI: https://doi.org/10.1007/978-3-030-11292-9_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-11291-2
Online ISBN: 978-3-030-11292-9
eBook Packages: Intelligent Technologies and Robotics (R0)