Abstract
This paper develops a novel online algorithm, moving average stochastic variational inference (MASVI), which reuses the results of previous iterations to smooth out noisy natural gradients. We analyze the convergence properties of the proposed algorithm and conduct experiments on two large-scale collections containing millions of documents. Experimental results indicate that, compared with stochastic variational inference (SVI) and stochastic gradient Riemannian Langevin dynamics (SGRLD), our algorithm converges faster and achieves better performance.
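The core idea, averaging noisy stochastic natural gradients over recent iterations before applying an update, can be illustrated with a minimal sketch. This is not the authors' exact MASVI update for topic models; the window size, step-size schedule, and toy quadratic objective below are all assumptions made only for this example.

```python
# Illustrative sketch of moving-average smoothing of noisy stochastic
# gradients (hypothetical toy setup, not the paper's MASVI for LDA).
import numpy as np
from collections import deque

rng = np.random.default_rng(0)
theta = np.zeros(5)            # parameter being optimized
target = np.ones(5)            # optimum of the toy quadratic objective
window = deque(maxlen=10)      # W most recent noisy gradient estimates (W assumed)

for t in range(1, 1001):
    # Noisy gradient of a toy quadratic objective; in MASVI the noisy
    # natural gradient would instead come from a minibatch of documents.
    noisy_grad = (target - theta) + rng.normal(scale=1.0, size=5)
    window.append(noisy_grad)

    # Moving average over previous iterations smooths out the noise.
    smoothed = np.mean(window, axis=0)

    rho = (t + 10) ** -0.7     # Robbins-Monro style decaying step size
    theta += rho * smoothed

print(np.round(theta, 2))      # close to `target` despite the noise
```

Averaging over a fixed window reduces the variance of each update at the cost of a small bias from stale gradients, which is the trade-off an algorithm of this kind must balance against plain SVI.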
Additional information
Project supported by the National Natural Science Foundation of China (Nos. 61170092, 61133011, and 61103091)
ORCID: Xi-ming LI, http://orcid.org/0000-0001-8190-5087
Cite this article
Li, X.-m., Ouyang, J.-h. & Lu, Y. Topic modeling for large-scale text data. Frontiers Inf Technol Electronic Eng 16, 457–465 (2015). https://doi.org/10.1631/FITEE.1400352