Abstract
In this paper, we present an infinite mixture model based on inverted Dirichlet distributions. The proposed mixture is learned using a fully Bayesian approach, which makes it possible to overcome a challenging issue in data clustering, namely the automatic selection of the number of clusters. We evaluate the proposed approach on the challenging problem of text categorization. The results show that it models positive data effectively when compared with the results reported for the infinite Gaussian mixture.
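As a rough illustration of the two ingredients named above, the sketch below evaluates the log-density of an inverted Dirichlet distribution in its standard form (a D-dimensional positive vector with D + 1 positive parameters) and draws cluster labels from a Chinese restaurant process, a common device for letting the number of clusters grow with the data. The function names and the `concentration` parameter are assumptions made for this example, and the code is not claimed to be the learning algorithm used in the paper.

```python
import math
import random


def inverted_dirichlet_logpdf(x, alpha):
    """Log-density of an inverted Dirichlet at a positive vector x.

    x has D entries; alpha has D + 1 positive entries (standard form).
    """
    if len(alpha) != len(x) + 1:
        raise ValueError("alpha must have one more entry than x")
    if any(v <= 0 for v in x) or any(a <= 0 for a in alpha):
        raise ValueError("x and alpha must be strictly positive")
    total = sum(alpha)
    # log of Gamma(|alpha|) / prod_d Gamma(alpha_d)
    log_norm = math.lgamma(total) - sum(math.lgamma(a) for a in alpha)
    # sum over the first D parameters of (alpha_d - 1) * log(x_d)
    log_kernel = sum((a - 1.0) * math.log(v) for a, v in zip(alpha, x))
    # the factor (1 + sum_d x_d) ** (-|alpha|)
    return log_norm + log_kernel - total * math.log1p(sum(x))


def crp_assignments(n, concentration, rng):
    """Draw cluster labels for n points from a Chinese restaurant process.

    Illustrative only: shows how the number of clusters is left open
    rather than fixed in advance.
    """
    labels, counts = [], []
    for _ in range(n):
        # An existing cluster k has weight counts[k];
        # opening a new cluster has weight `concentration`.
        weights = counts + [concentration]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
        labels.append(k)
    return labels


if __name__ == "__main__":
    rng = random.Random(0)
    labels = crp_assignments(50, concentration=1.0, rng=rng)
    print("clusters used:", len(set(labels)))
    print("log-density:", inverted_dirichlet_logpdf([0.5, 2.0], [2.0, 3.0, 4.0]))
```

In a full sampler, the cluster weights would also depend on how well each component's density explains the data point, which is where a function like `inverted_dirichlet_logpdf` would enter.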
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bdiri, T., Bouguila, N. (2011). An Infinite Mixture of Inverted Dirichlet Distributions. In: Lu, BL., Zhang, L., Kwok, J. (eds) Neural Information Processing. ICONIP 2011. Lecture Notes in Computer Science, vol 7063. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24958-7_9
DOI: https://doi.org/10.1007/978-3-642-24958-7_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24957-0
Online ISBN: 978-3-642-24958-7
eBook Packages: Computer Science (R0)