Keeping Priors in Streaming Bayesian Learning

Duc, Anh Nguyen; Van Linh, Ngo; Kim, Anh Nguyen; Than, Khoat

doi:10.1007/978-3-319-57529-2_20

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10235)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

Exploiting prior knowledge in the Bayesian learning process is one way to improve the quality of a Bayesian model. To the best of our knowledge, however, there has been no formal study of the influence of the prior in a streaming environment. In this paper, we address the problem of using prior knowledge in streaming Bayesian learning and develop a framework for keeping priors in streaming learning (KPS), which maintains knowledge from the prior through each minibatch of streaming data. We demonstrate the performance of our framework in two scenarios, streaming learning for latent Dirichlet allocation and streaming text classification, in comparison with methods that do not keep the prior.

Acknowledgement

This research is funded by the Vietnam National Foundation for Science and Technology Development (NAFOSTED) under grant number 102.05-2014.28, and by the Air Force Office of Scientific Research (AFOSR), the Asian Office of Aerospace Research & Development (AOARD), and the US Army International Technology Center, Pacific (ITC-PAC) under award number FA2386-15-1-4011.

Author information

Authors and Affiliations

Hanoi University of Science and Technology, No. 1, Dai Co Viet Road, Hanoi, Vietnam
Anh Nguyen Duc, Ngo Van Linh, Anh Nguyen Kim & Khoat Than

Corresponding author

Correspondence to Khoat Than.

Editor information

Editors and Affiliations

Kangwon National University, Chuncheon, Korea (Republic of)
Jinho Kim
Seoul National University, Seoul, Korea (Republic of)
Kyuseok Shim
University of Technology Sydney, Sydney, New South Wales, Australia
Longbing Cao
KAIST, Daejeon, Korea (Republic of)
Jae-Gil Lee
University of New South Wales, Sydney, New South Wales, Australia
Xuemin Lin
Kangwon National University, Chuncheon, Korea (Republic of)
Yang-Sae Moon

About this paper

Cite this paper

Duc, A.N., Van Linh, N., Kim, A.N., Than, K. (2017). Keeping Priors in Streaming Bayesian Learning. In: Kim, J., Shim, K., Cao, L., Lee, JG., Lin, X., Moon, YS. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2017. Lecture Notes in Computer Science, vol 10235. Springer, Cham. https://doi.org/10.1007/978-3-319-57529-2_20

DOI: https://doi.org/10.1007/978-3-319-57529-2_20
Published: 23 April 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-57528-5
Online ISBN: 978-3-319-57529-2
eBook Packages: Computer Science, Computer Science (R0)
