Abstract
Exploiting prior knowledge in the Bayesian learning process is one way to improve the quality of a Bayesian model. To the best of our knowledge, however, there is no formal research on the influence of the prior in a streaming environment. In this paper, we address the problem of using prior knowledge in streaming Bayesian learning and develop a framework for keeping priors in streaming learning (KPS), which maintains knowledge from the prior through each minibatch of streaming data. We demonstrate the performance of our framework in two scenarios, streaming learning for latent Dirichlet allocation and streaming text classification, in comparison with methods that do not keep the prior.
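To make the concern about the prior's influence concrete, the following minimal sketch uses a toy Dirichlet-multinomial model. This is an illustrative assumption and not the KPS framework itself. It applies the usual streaming recipe, in which the posterior after one minibatch serves as the prior for the next, and records how small the original prior's share of the total pseudo-counts becomes as minibatches arrive. The function name `streaming_dirichlet_update` and the example counts are hypothetical.

```python
import numpy as np


def streaming_dirichlet_update(prior_alpha, minibatches):
    """Toy streaming update for a Dirichlet-multinomial model.

    Assumption for illustration only: the posterior after each minibatch
    is reused as the prior for the next one (conjugate update).
    Returns the final Dirichlet parameters and, per minibatch, the share
    of total pseudo-counts that still comes from the original prior.
    """
    prior_alpha = np.asarray(prior_alpha, dtype=float)
    alpha = prior_alpha.copy()
    prior_share = []
    for counts in minibatches:
        # Conjugate update: add the observed category counts.
        alpha = alpha + np.asarray(counts, dtype=float)
        # Fraction of the accumulated mass contributed by the original prior.
        prior_share.append(prior_alpha.sum() / alpha.sum())
    return alpha, prior_share


if __name__ == "__main__":
    # Hypothetical prior and three hypothetical minibatches of category counts.
    alpha, shares = streaming_dirichlet_update(
        [2.0, 1.0, 1.0],
        [[5, 3, 2], [4, 4, 2], [6, 1, 3]],
    )
    print("final alpha:", alpha)
    print("prior share per minibatch:", shares)
```

In this toy setting the prior's share shrinks with every minibatch, which is the kind of fading influence that motivates maintaining prior knowledge throughout the stream.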
Acknowledgement
This research is funded by the Vietnam National Foundation for Science and Technology Development (NAFOSTED) under grant number 102.05-2014.28, and by the Air Force Office of Scientific Research (AFOSR), Asian Office of Aerospace Research & Development (AOARD), and US Army International Technology Center, Pacific (ITC-PAC) under award number FA2386-15-1-4011.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Duc, A.N., Van Linh, N., Kim, A.N., Than, K. (2017). Keeping Priors in Streaming Bayesian Learning. In: Kim, J., Shim, K., Cao, L., Lee, JG., Lin, X., Moon, YS. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2017. Lecture Notes in Computer Science, vol 10235. Springer, Cham. https://doi.org/10.1007/978-3-319-57529-2_20
DOI: https://doi.org/10.1007/978-3-319-57529-2_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-57528-5
Online ISBN: 978-3-319-57529-2
eBook Packages: Computer Science, Computer Science (R0)