Abstract
Clustering, classification, and pattern recognition in a data set are among the most important tasks in statistical research and in many applications. In this paper, we propose to model the data with a mixture of Student-t distributions, via a hierarchical graphical model, and to carry out these tasks in a Bayesian framework. The main advantages of this model are that it accounts for the uncertainties of the variances and covariances, and that Variational Bayesian Approximation (VBA) methods can be used to obtain fast algorithms able to handle large data sets.
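To make the model concrete, below is a minimal illustrative sketch of fitting a univariate mixture of Student-t distributions via the Gaussian scale-mixture representation t(x | mu, s^2, nu) = integral of N(x | mu, s^2/u) Gam(u | nu/2, nu/2) du, which also underlies the hierarchical graphical model described above. This sketch uses classical EM rather than the paper's VBA algorithm; the function name em_student_t_mixture, the fixed degrees of freedom nu, and the univariate setting are simplifying assumptions made for illustration (the paper instead places priors on all parameters and approximates the joint posterior variationally).

```python
# Illustrative sketch only (NOT the paper's VBA algorithm): EM for a
# univariate mixture of Student-t distributions with fixed degrees of
# freedom nu, using the Gaussian scale-mixture (latent-scale) trick.
import numpy as np
from scipy import stats

def em_student_t_mixture(x, K=2, nu=4.0, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # Initialize mixing weights, means, and scales from the data.
    pi = np.full(K, 1.0 / K)
    mu = rng.choice(x, K, replace=False)
    s2 = np.full(K, np.var(x))
    for _ in range(n_iter):
        # E-step: responsibilities r[i, k] from the Student-t marginals.
        logp = np.stack([
            np.log(pi[k]) + stats.t.logpdf(x, df=nu, loc=mu[k],
                                           scale=np.sqrt(s2[k]))
            for k in range(K)], axis=1)
        logp -= logp.max(axis=1, keepdims=True)  # numerical stability
        r = np.exp(logp)
        r /= r.sum(axis=1, keepdims=True)
        # Expected latent scales u[i, k] under their Gamma posterior.
        d2 = (x[:, None] - mu[None, :]) ** 2 / s2[None, :]
        u = (nu + 1.0) / (nu + d2)
        # M-step: standard weighted updates for a Student-t mixture.
        w = r * u
        pi = r.mean(axis=0)
        mu = (w * x[:, None]).sum(axis=0) / w.sum(axis=0)
        s2 = (w * (x[:, None] - mu[None, :]) ** 2).sum(axis=0) / r.sum(axis=0)
    return pi, mu, s2

# Example: two heavy-tailed clusters.
x = np.concatenate([stats.t.rvs(4, loc=-3.0, size=500, random_state=1),
                    stats.t.rvs(4, loc=+3.0, size=500, random_state=2)])
print(em_student_t_mixture(x))
```

The heavy tails of the Student-t components make this fit robust to outliers, which is one motivation for preferring it over a Gaussian mixture; the VBA treatment in the paper additionally quantifies the uncertainty of the variance and covariance estimates.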
Acknowledgment
This work has been partially supported by the C5SYS project (https://www.erasysbio.net/index.php?index=272) of ERASYSBIO.
© 2015 Springer International Publishing Switzerland
Cite this paper
Mohammad-Djafari, A. (2015). Variational Bayesian Approximation Method for Classification and Clustering with a Mixture of Student-t Model. In: Nielsen, F., Barbaresco, F. (eds) Geometric Science of Information. GSI 2015. Lecture Notes in Computer Science, vol. 9389. Springer, Cham. https://doi.org/10.1007/978-3-319-25040-3_77