Abstract
This paper deals with the Bayesian analysis of finite mixture models with a fixed number of component distributions from natural exponential families with quadratic variance function (NEF-QVF). A unified Bayesian framework addressing the two main difficulties in this context is presented: the choice of the prior distribution and the parameter unidentifiability problem. To deal with the first issue, conjugate prior distributions are used, and an algorithm is developed to calculate the prior parameters yielding the least informative distribution within the conjugate class. Regarding the second issue, a general algorithm to solve the label-switching problem is presented. These techniques are easily applied in practice, as shown with an illustrative example.






Acknowledgments
Comments from David Ríos-Insua are gratefully acknowledged. This research was partially supported by Ministerio de Educación y Ciencia, Spain (Project TSI2004-06801-C04-03).
Appendix: Theoretical results
Lemma 1 is required for Proposition 1. In Algorithm 3, step 1 follows from Proposition 2 and step 2 follows from Proposition 1.
Lemma 1 The following expression holds:
Proof
Proposition 1 The following expression holds:
where C is a function that does not depend on the permutation v or on \(\widehat{\boldsymbol{\phi}}=(\widehat{\boldsymbol{\omega}}, \widehat{\boldsymbol{\mu}})\).
Proof
[By Lemma 1]
Proposition 2 The minimum over \(\widehat{\boldsymbol{\phi}}=(\widehat{\boldsymbol{\omega}}, \widehat{\boldsymbol{\mu}})\) of \(D=\sum\nolimits_{t=1}^{N} D\left[v_{t}\left(\boldsymbol{\phi}^{(t)}\right) \| \widehat{\boldsymbol{\phi}}\right]\) is achieved at:
Proof By Proposition 1, the problem can be divided into the following steps:
1. Choose \(\widehat{\omega}_{j}\) \((j=1, \ldots, k)\) to maximize
$$\sum\limits_{t=1}^{N} \omega_{v_{t}(j)}^{(t)} \log \widehat{\omega}_{j}+\left(1-\omega_{v_{t}(j)}^{(t)}\right) \log \left(1-\widehat{\omega}_{j}\right). \quad (5)$$
2. Choose \(\widehat{\mu}_{j}\) \((j=1, \ldots, k)\) to minimize
$$\sum\limits_{t=1}^{N}-\omega_{v_{t}(j)}^{(t)} \widehat{\theta}_{j}\left(\widehat{\mu}_{j}\right) \mu_{v_{t}(j)}^{(t)}+\omega_{v_{t}(j)}^{(t)} M\left(\widehat{\theta}_{j}\left(\widehat{\mu}_{j}\right)\right). \quad (6)$$
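The two optimizations above admit closed-form solutions. The following is a minimal numerical sketch, assuming the relabeled MCMC draws \(\omega_{v_t(j)}^{(t)}\) and \(\mu_{v_t(j)}^{(t)}\) are stored as NumPy arrays of shape (N, k); the function name `reference_parameters` and the toy data are illustrative, not from the paper.

```python
import numpy as np

def reference_parameters(omega, mu):
    """Solve steps 1 and 2 of Proposition 2 in closed form.

    omega, mu : (N, k) arrays holding the relabeled draws
        omega[t, j] = omega_{v_t(j)}^{(t)} and mu[t, j] = mu_{v_t(j)}^{(t)}.
    """
    # Maximizer of (5): the average relabeled weight for each component.
    w_hat = omega.mean(axis=0)
    # Minimizer of (6): the weight-weighted average of the relabeled means.
    mu_hat = (omega * mu).sum(axis=0) / omega.sum(axis=0)
    return w_hat, mu_hat

# Toy example: N = 2 draws, k = 2 components.
omega = np.array([[0.3, 0.7],
                  [0.5, 0.5]])
mu = np.array([[1.0, 2.0],
               [3.0, 4.0]])
w_hat, mu_hat = reference_parameters(omega, mu)
# w_hat  -> [0.4, 0.6]
# mu_hat -> [2.25, 2.8333...]
```

The vectorized form reflects that the objective separates over components j, so each \(\widehat{\omega}_j\) and \(\widehat{\mu}_j\) is computed from its own column of draws.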
Dividing (5) by N yields:
Differentiating and setting the result equal to zero gives:
To solve (6), the following is computed:
and setting the derivative equal to zero gives:
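Spelled out, the calculus for the two steps runs as follows; this is a sketch consistent with (5) and (6), using the standard NEF identity \(M'(\theta)=\mu\), and is a reconstruction rather than the paper's own display. For step 1,
$$\frac{\partial}{\partial \widehat{\omega}_{j}}\,\frac{1}{N}\sum\limits_{t=1}^{N}\left[\omega_{v_{t}(j)}^{(t)} \log \widehat{\omega}_{j}+\left(1-\omega_{v_{t}(j)}^{(t)}\right) \log \left(1-\widehat{\omega}_{j}\right)\right]=0 \;\Longrightarrow\; \widehat{\omega}_{j}=\frac{1}{N}\sum\limits_{t=1}^{N}\omega_{v_{t}(j)}^{(t)}.$$
For step 2, differentiating (6) with respect to \(\widehat{\mu}_{j}\) and using \(M'\!\left(\widehat{\theta}_{j}(\widehat{\mu}_{j})\right)=\widehat{\mu}_{j}\),
$$\sum\limits_{t=1}^{N}\omega_{v_{t}(j)}^{(t)}\left(\widehat{\mu}_{j}-\mu_{v_{t}(j)}^{(t)}\right)\widehat{\theta}_{j}'\!\left(\widehat{\mu}_{j}\right)=0 \;\Longrightarrow\; \widehat{\mu}_{j}=\frac{\sum\nolimits_{t=1}^{N}\omega_{v_{t}(j)}^{(t)}\,\mu_{v_{t}(j)}^{(t)}}{\sum\nolimits_{t=1}^{N}\omega_{v_{t}(j)}^{(t)}}.$$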
Cite this article
Rufo, M.J., Martín, J. & Pérez, C.J. Bayesian analysis of finite mixture models of distributions from exponential families. Computational Statistics 21, 621–637 (2006). https://doi.org/10.1007/s00180-006-0018-8