Bayesian Sparse Topic Model

Journal of Signal Processing Systems

Abstract

This paper presents a new Bayesian sparse learning approach that selects salient lexical features for sparse topic modeling. Bayesian learning based on latent Dirichlet allocation (LDA) is performed by incorporating spike-and-slab priors. In this sparse LDA (sLDA), the spike distribution selects the salient words, while the slab distribution builds the latent topic model over the selected relevant words. A variational inference procedure is developed to estimate the prior parameters of sLDA. In experiments on document modeling with LDA and sLDA, we find that the proposed sLDA not only reduces model perplexity but also lowers memory and computation costs. The Bayesian feature selection method effectively identifies relevant topic words for building a sparse topic model.

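To make the abstract's construction concrete, the sketch below illustrates a spike-and-slab prior for word selection in a topic model. This is a minimal illustration in Python, not the paper's algorithm: the vocabulary size, topic count, Bernoulli spike weight pi, and Dirichlet slab concentration are all assumed values, and the paper infers such quantities with variational inference rather than sampling them from the prior as done here.

    import numpy as np

    rng = np.random.default_rng(0)

    # Illustrative sizes and hyperparameters (assumed for this sketch,
    # not taken from the paper).
    V, K = 1000, 20   # vocabulary size, number of topics
    pi = 0.1          # prior probability that a word is salient

    # "Spike": a per-word Bernoulli indicator decides whether the word
    # is a salient lexical feature that enters the topic model.
    salient = rng.binomial(1, pi, size=V).astype(bool)
    selected = np.flatnonzero(salient)

    # "Slab": Dirichlet topic-word distributions over the selected words only.
    beta_slab = rng.dirichlet(np.full(selected.size, 0.01), size=K)

    # Embed into the full vocabulary; unselected words get zero probability,
    # which is where the sparsity and the memory/computation savings arise.
    beta = np.zeros((K, V))
    beta[:, selected] = beta_slab

    print(f"kept {selected.size} of {V} words; "
          f"topic-word matrix is {100.0 * (1 - salient.mean()):.0f}% sparse")

Because each topic places probability mass only on the selected vocabulary, the topic-word matrix can be stored and updated over roughly pi * V columns instead of all V, which is consistent with the memory and computation savings the abstract reports.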

Acknowledgments

This work was supported in part by the National Science Council, Taiwan, under Contract NSC 100-2221-E-009-153-MY3.

Author information

Correspondence to Jen-Tzung Chien.

About this article

Cite this article

Chien, J.-T., & Chang, Y.-L. Bayesian Sparse Topic Model. Journal of Signal Processing Systems 74, 375–389 (2014). https://doi.org/10.1007/s11265-013-0759-x
