Abstract
This paper presents a new Bayesian sparse learning approach that selects salient lexical features for sparse topic modeling. Bayesian learning based on latent Dirichlet allocation (LDA) is performed by incorporating spike-and-slab priors. In this sparse LDA (sLDA), the spike distribution selects salient words, while the slab distribution builds the latent topic model over the selected relevant words. A variational inference procedure is developed to estimate the prior parameters of sLDA. In experiments on document modeling with LDA and sLDA, the proposed sLDA not only reduces model perplexity but also lowers memory and computational costs. The Bayesian feature selection method effectively identifies the relevant topic words for building a sparse topic model.
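The spike-and-slab construction in the abstract can be illustrated with a minimal sketch: each vocabulary word draws a Bernoulli inclusion indicator, selected words take their topic weight from a broad "slab" distribution, and unselected words from a narrow "spike" concentrated near zero. This is a toy illustration only, with hypothetical names and parameters (`pi`, `slab_scale`, `spike_scale`); it is not the paper's variational inference procedure.

```python
import random

def spike_and_slab_topic_weights(vocab, pi=0.3, slab_scale=1.0,
                                 spike_scale=1e-3, seed=0):
    """Draw per-word topic weights under a spike-and-slab prior (toy sketch).

    Each word gets a Bernoulli(pi) inclusion indicator: selected words draw
    their weight from the broad slab (salient for the topic); the rest draw
    from the narrow spike, yielding near-zero weight, i.e. a sparse topic.
    """
    rng = random.Random(seed)
    weights, salient = {}, []
    for w in vocab:
        if rng.random() < pi:                                # spike-or-slab choice
            weights[w] = abs(rng.gauss(0.0, slab_scale))     # slab: salient word
            salient.append(w)
        else:
            weights[w] = abs(rng.gauss(0.0, spike_scale))    # spike: ~0 weight
    total = sum(weights.values())
    probs = {w: v / total for w, v in weights.items()}       # normalize to a topic
    return probs, salient

probs, salient = spike_and_slab_topic_weights(
    ["model", "topic", "word", "the", "of", "data", "prior", "and"])
```

Because the slab mass dominates after normalization, the resulting topic distribution concentrates on the selected words, which is the sparsity effect the paper reports.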
Acknowledgments
This work was supported in part by the National Science Council, Taiwan, under Contract NSC 100-2221-E-009-153-MY3.
Cite this article
Chien, JT., Chang, YL. Bayesian Sparse Topic Model. J Sign Process Syst 74, 375–389 (2014). https://doi.org/10.1007/s11265-013-0759-x