Abstract
This paper deals with the unsupervised classification of univariate observations. Given a set of observations drawn from a K-component mixture, we focus on estimating the component expectations. We propose an algorithm based on the minimization of the “K-product” (KP) criterion introduced in a previous work. We show that the global minimum of this criterion can be reached by first solving a linear system and then computing the roots of a polynomial of order K. The KP global minimum provides a first raw estimate of the component expectations, and a nearest-neighbour classification then refines this estimate. The relevance of our method is finally illustrated through simulations of various mixtures. When the mixture components do not strongly overlap, the KP algorithm provides better estimates than the Expectation-Maximization (EM) algorithm.
References
Berkhin P (2006) A survey of clustering data mining techniques. In: Kogan J, Nicholas C, Teboulle M (eds) Grouping multidimensional data: recent advances in clustering. Springer, Berlin, pp 25–71
Bradley PS, Fayyad UM (1998) Refining initial points for K-means clustering. In: Proceedings of the 15th international conference on machine learning. Morgan Kaufmann, San Francisco, pp 91–99
Bojanczyk AW, Brent RP, de Hoog FR (1995) Stability analysis of a general Toeplitz system solver. Numer Algorithms 10:225–244
Celeux G, Chauveau D, Diebolt J (1995) On stochastic versions of the EM algorithm. INRIA research report no 2514, available at http://www.inria.fr/rrrt/rr-2514.html
Dempster A, Laird N, Rubin D (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc B 39:1–38
Fisher WD (1958) On grouping for maximum homogeneity. J Am Stat Assoc 53(284):789–798
Fitzgibbon LJ, Allison L, Dowe DL (2000) Minimum message length grouping of ordered data. In: Arimura H, Jain S (eds) Proceedings of the 11th international conference on algorithmic learning theory, Sydney, Australia. LNAI, Springer, Berlin, pp 56–70
Hartigan J, Wong M (1979) A K-means clustering algorithm. Appl Stat 28:100–108
Krishna K, Narasimha Murty M (1999) Genetic K-means algorithm. IEEE Trans Syst Man Cybern B Cybern 29(3):433–439
Lindsay B, Furman D (1994) Measuring the relative effectiveness of moment estimators as starting values in maximizing likelihoods. Comput Stat Data Anal 17(5):493–507
McLachlan G, Peel D (2000) Finite mixture models. Wiley, New York
Parzen E (1962) On estimation of a probability density function and mode. Ann Math Stat 33:1065–1076
Paul N, Terre M, Fety L (2006) The K-product criterion for Gaussian mixture estimation. In: Proceedings of the 7th Nordic signal processing symposium, Reykjavik, Iceland, pp 334–337. doi:10.1109/NORSIG.2006.275248
Pernkopf F, Bouchaffra D (2005) Genetic-based EM algorithm for learning Gaussian mixture models. IEEE Trans Pattern Anal Mach Intell 27(8):1344–1348
Uhlig F (1999) General polynomial roots and their multiplicities in O(n) memory and O(n²) time. Linear Multilinear Algebra 46(4):327–359
Xu R, Wunsch D II (2005) Survey of clustering algorithms. IEEE Trans Neural Netw 16(3):645–678
Paul, N., Terre, M. & Fety, L. A global algorithm to estimate the expectations of the components of an observed univariate mixture. ADAC 1, 201–219 (2007). https://doi.org/10.1007/s11634-007-0014-z