Keeping Priors in Streaming Bayesian Learning

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2017)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10235)

Abstract

Exploiting prior knowledge in the Bayesian learning process is one way to improve the quality of a Bayesian model. To the best of our knowledge, however, there is no formal research on the influence of the prior in a streaming environment. In this paper, we address the problem of using prior knowledge in streaming Bayesian learning, and develop a framework for keeping priors in streaming learning (KPS) that maintains knowledge from the prior through each minibatch of streaming data. We demonstrate the performance of our framework in two scenarios, streaming learning for latent Dirichlet allocation and streaming text classification, in comparison with methods that do not keep the prior.
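
As a rough illustration of why the influence of a prior can fade in a streaming setting, the sketch below uses a toy conjugate Dirichlet-multinomial model in which each minibatch's posterior becomes the next minibatch's prior. It is not the KPS framework, whose details are not given in this abstract; the prior values, batch size, and function names are hypothetical and chosen only for illustration.

```python
import numpy as np

def streaming_dirichlet_update(prior_alpha, minibatches):
    """Recursive conjugate update: each posterior is reused as the next prior."""
    alpha = np.asarray(prior_alpha, dtype=float).copy()
    history = []
    for counts in minibatches:
        # Dirichlet posterior parameters = current parameters + observed counts
        alpha = alpha + np.asarray(counts, dtype=float)
        history.append(alpha.copy())
    return history

def prior_weight(prior_alpha, alpha):
    """Share of the total pseudo-count mass that comes from the original prior."""
    return float(np.sum(prior_alpha) / np.sum(alpha))

# Hypothetical prior that favours the first of three words.
prior_alpha = np.array([8.0, 1.0, 1.0])

# Five minibatches of 100 tokens each, drawn from a uniform word distribution.
rng = np.random.default_rng(0)
minibatches = [rng.multinomial(100, [1 / 3, 1 / 3, 1 / 3]) for _ in range(5)]

history = streaming_dirichlet_update(prior_alpha, minibatches)
weights = [prior_weight(prior_alpha, a) for a in history]
for t, w in enumerate(weights, start=1):
    print(f"after minibatch {t}: prior share of pseudo-counts = {w:.4f}")
```

In this toy model the prior contributes 10 pseudo-counts in total, so its share after t minibatches of 100 tokens is 10 / (10 + 100t), which shrinks steadily as data arrive.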

Notes

  1. https://gist.github.com/h3xx/1976236

  2. http://ana.cachopo.org/datasets-for-single-label-text-categorization

Acknowledgement

This research is funded by the Vietnam National Foundation for Science and Technology Development (NAFOSTED) under grant number 102.05-2014.28, and by the Air Force Office of Scientific Research (AFOSR), the Asian Office of Aerospace Research & Development (AOARD), and the US Army International Technology Center, Pacific (ITC-PAC) under award number FA2386-15-1-4011.

Author information

Corresponding author

Correspondence to Khoat Than.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Duc, A.N., Van Linh, N., Kim, A.N., Than, K. (2017). Keeping Priors in Streaming Bayesian Learning. In: Kim, J., Shim, K., Cao, L., Lee, JG., Lin, X., Moon, YS. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2017. Lecture Notes in Computer Science, vol 10235. Springer, Cham. https://doi.org/10.1007/978-3-319-57529-2_20

  • DOI: https://doi.org/10.1007/978-3-319-57529-2_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-57528-5

  • Online ISBN: 978-3-319-57529-2

  • eBook Packages: Computer Science, Computer Science (R0)