Abstract
This paper proposes a new inference method for latent Dirichlet allocation (LDA) [4]. Our proposal is an instance of stochastic gradient variational Bayes (SGVB) [9, 13], a general framework for devising posterior inferences for Bayesian probabilistic models. Our aim is to demonstrate the effectiveness of SGVB by presenting an SGVB-type inference for LDA, the best-known Bayesian model in text mining. The proposed inference is easy to implement from scratch. Its distinctive feature is that the logistic normal distribution is used to approximate the true posterior. This may seem counterintuitive, because the Dirichlet distribution is what the functional derivative yields when we lower-bound the log evidence of LDA under a mean-field approximation. However, our experiments showed that the proposed inference achieved better predictive performance, in terms of test set perplexity, than the inference using the Dirichlet distribution for posterior approximation. While the logistic normal is harder to manipulate than the Dirichlet, SGVB makes it relatively easy to handle the expectations taken with respect to the approximate posterior. The proposed inference even outperformed collapsed Gibbs sampling [6] in many, though not all, of the settings examined in our experiments. Devising SGVB-based inferences for other Bayesian models is worthwhile future work.
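The core computational idea behind an SGVB-type inference, a reparameterized Monte Carlo estimate of the evidence lower bound, can be sketched briefly. The following is a minimal illustration, not the authors' implementation: it assumes PyTorch, and the names `mu`, `log_sigma`, `K`, and the placeholder objective are ours, chosen only to show how a logistic-normal sample of per-document topic proportions remains differentiable in the variational parameters.

```python
# Hypothetical sketch of the reparameterization trick with a
# logistic-normal approximate posterior over topic proportions.
import torch

K = 10                                            # number of topics (illustrative)
mu = torch.zeros(K, requires_grad=True)           # variational mean
log_sigma = torch.zeros(K, requires_grad=True)    # variational log std

eps = torch.randn(K)                 # noise drawn independently of the parameters
z = mu + log_sigma.exp() * eps       # reparameterized Gaussian sample
theta = torch.softmax(z, dim=0)      # logistic normal: softmax of a Gaussian

# theta lies on the probability simplex, and gradients of any Monte
# Carlo ELBO estimate flow back to mu and log_sigma through z.
loss = -theta.log().sum()            # placeholder objective for illustration only
loss.backward()
print(mu.grad, log_sigma.grad)
```

Because the noise `eps` does not depend on `mu` or `log_sigma`, a single stochastic gradient step on such a Monte Carlo objective updates the variational parameters directly, which is what makes the otherwise awkward logistic-normal expectations tractable under SGVB.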
Notes
- 1.
Precisely speaking, the VB presented in [4] performs a point estimation for the per-topic word multinomial distributions. In what we call the standard VB here, Bayesian inference is performed not only for the per-document topic multinomial distributions but also for the per-topic word multinomial distributions.
- 2.
- 3.
- 4.
We used the XML files from medline14n0770.xml to medline14n0774.xml.
References
Aitchison, J., Shen, S.-M.: Logistic-normal distributions: some properties and uses. Biometrika 67(2), 261–272 (1980)
Asuncion, A., Welling, M., Smyth, P., Teh, Y.W.: On smoothing and inference for topic models. In: UAI, pp. 27–34 (2009)
Blei, D.M., Lafferty, J.D.: Correlated topic models. In: NIPS, pp. 147–154 (2005)
Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. JMLR 3, 993–1022 (2003)
Brooks, S., Gelman, A., Jones, G., Meng, X.-L.: Handbook of Markov Chain Monte Carlo. CRC Press, Boca Raton (2011)
Griffiths, T.L., Steyvers, M.: Finding scientific topics. PNAS 101(Suppl 1), 5228–5235 (2004)
Kang, J.-H., Lerman, K., Getoor, L.: LA-LDA: a limited attention topic model for social recommendation. In: Greenberg, A.M., Kennedy, W.G., Bos, N.D. (eds.) SBP 2013. LNCS, vol. 7812, pp. 211–220. Springer, Heidelberg (2013)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: ICLR (2015)
Kingma, D.P., Welling, M.: Stochastic gradient VB and the variational auto-encoder. In: ICLR (2014)
Lin, T.-Y., Tian, W.-T., Mei, Q.-Z., Cheng, H.: The dual-sparse topic model: mining focused topics and focused terms in short text. In: WWW, pp. 539–550 (2014)
Mnih, A., Gregor, K.: Neural variational inference and learning in belief networks. In: ICML, pp. 1791–1799 (2014)
O’Connor, B., Stewart, B.M., Smith, N.A.: Learning to extract international relations from political context. In: ACL, pp. 1094–1104 (2013)
Rezende, D.J., Mohamed, S., Wierstra, D.: Stochastic backpropagation and approximate inference in deep generative models. In: ICML, pp. 1278–1286 (2014)
Robert, C.P., Casella, G.: Monte Carlo Statistical Methods. Springer, New York (2004)
Sasaki, K., Yoshikawa, T., Furuhashi, T.: Online topic model for Twitter considering dynamics of user interests and topic trends. In: EMNLP, pp. 1977–1985 (2014)
Teh, Y.W., Newman, D., Welling, M.: A collapsed variational Bayesian inference algorithm for latent Dirichlet allocation. In: NIPS, pp. 1353–1360 (2007)
Vosecky, J., Leung, K.W.-T., Ng, W.: Collaborative personalized Twitter search with topic-language models. In: SIGIR, pp. 53–62 (2014)
Yan, F., Xu, N.-Y., Qi, Y.: Parallel inference for latent Dirichlet allocation on graphics processing units. In: NIPS, pp. 2134–2142 (2009)
Zhao, H.-S., Jiang, B.-Y., Canny, J.F., Jaros, B.: SAME but different: fast and high quality Gibbs parameter estimation. In: KDD, pp. 1495–1502 (2015)
Copyright information
© 2016 Springer International Publishing Switzerland
Cite this paper
Masada, T., Takasu, A. (2016). A Simple Stochastic Gradient Variational Bayes for Latent Dirichlet Allocation. In: Gervasi, O., et al. Computational Science and Its Applications – ICCSA 2016. ICCSA 2016. Lecture Notes in Computer Science, vol. 9789. Springer, Cham. https://doi.org/10.1007/978-3-319-42089-9_17
DOI: https://doi.org/10.1007/978-3-319-42089-9_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-42088-2
Online ISBN: 978-3-319-42089-9
eBook Packages: Computer Science, Computer Science (R0)