Definition
Model-based clustering is a statistical approach to data clustering. The observed (multivariate) data are assumed to have been generated from a finite mixture of component models. Each component model is a probability distribution, typically a parametric multivariate distribution. For example, in a multivariate Gaussian mixture model, each component is a multivariate Gaussian distribution. The component responsible for generating a particular observation determines the cluster to which that observation belongs. However, both the component that generated each observation and the parameters of the component distributions are unknown. The key learning task is therefore to infer the component responsible for generating each observation, which in turn gives the clustering of the data. Ideally, observations generated from the same component are inferred to belong to the same cluster. In addition to inferring the component assignment of observations, most popular learning approaches...
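As an illustration of the idea described above, the following is a minimal sketch of model-based clustering with a Gaussian mixture, fitted by the EM algorithm (Dempster et al. 1977). It is not taken from the entry itself and rests on several simplifying assumptions: the data are one-dimensional rather than multivariate, the number of components k is known, the means are initialized at evenly spaced quantiles of the data, and variances are floored to avoid degenerate components. The names fit_gmm_1d, responsibilities, and make_demo_data are hypothetical and chosen only for this example.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def responsibilities(data, weights, means, variances):
    """E-step: posterior probability of each component for each observation."""
    k = len(weights)
    resp = []
    for x in data:
        joint = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
        total = sum(joint)
        if total == 0.0:
            # Numerical underflow: fall back to a uniform posterior (assumption).
            resp.append([1.0 / k] * k)
        else:
            resp.append([p / total for p in joint])
    return resp


def fit_gmm_1d(data, k, n_iter=100, min_var=1e-6):
    """Fit a k-component 1-D Gaussian mixture by EM and return hard cluster labels."""
    n = len(data)
    ordered = sorted(data)
    # Assumed initialization: means at evenly spaced quantiles, shared variance,
    # uniform mixing weights.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [max(overall_var, min_var)] * k
    weights = [1.0 / k] * k

    for _ in range(n_iter):
        resp = responsibilities(data, weights, means, variances)
        # M-step: re-estimate each component from its weighted observations.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, min_var)

    # The inferred cluster of an observation is its most probable component.
    final_resp = responsibilities(data, weights, means, variances)
    labels = [max(range(k), key=lambda j: r[j]) for r in final_resp]
    return weights, means, variances, labels


def make_demo_data(seed=1):
    """Synthetic data: 50 draws from N(0, 1) followed by 50 draws from N(10, 1)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(50)] + [rng.gauss(10.0, 1.0) for _ in range(50)]


if __name__ == "__main__":
    weights, means, variances, labels = fit_gmm_1d(make_demo_data(), k=2)
    print("weights:", weights)
    print("means:", means)
    print("variances:", variances)
```

In this sketch the soft posterior probabilities computed in the E-step play the role of the unknown component assignments, and the final hard labels give the clustering. A multivariate version would replace the scalar mean and variance with a mean vector and covariance matrix per component.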
Recommended Reading
Banerjee A, Merugu S, Dhillon I, Ghosh J (2005) Clustering with Bregman divergences. J Mach Learn Res 6:1705–1749
Bilmes J (1997) A gentle tutorial on the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models. Technical Report ICSI-TR-97-02, University of Berkeley
Blei DM, Ng AY, Jordan MI (2003) Latent Dirichlet allocation. J Mach Learn Res 3:993–1022
Dasgupta S (1999) Learning mixtures of Gaussians. In: IEEE symposium on foundations of Computer Science (FOCS). IEEE Press, Washington, DC
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B (Methodol) 39(1):1–38
Kannan R, Salmasian H, Vempala S (2005) The spectral method for general mixture models. In: Conference on learning theory (COLT)
McLachlan GJ, Krishnan T (1996) The EM algorithm and extensions. Wiley-Interscience, New York
McLachlan GJ, Peel D (2000) Finite mixture models. Wiley series in probability and mathematical statistics: applied probability and statistics section. Wiley, New York
Neal RM, Hinton GE (1998) A view of the EM algorithm that justifies incremental, sparse, and other variants. In: Jordan MI (ed) Learning in graphical models (pp 355–368). MIT Press, Cambridge, MA
Redner R, Walker H (1984) Mixture densities, maximum likelihood and the EM algorithm. SIAM Rev 26(2):195–239
© 2017 Springer Science+Business Media New York
Cite this entry
Banerjee, A., Shan, H. (2017). Model-Based Clustering. In: Sammut, C., Webb, G.I. (eds) Encyclopedia of Machine Learning and Data Mining. Springer, Boston, MA. https://doi.org/10.1007/978-1-4899-7687-1_554
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4899-7685-7
Online ISBN: 978-1-4899-7687-1
eBook Packages: Computer Science; Reference Module Computer Science and Engineering