A Novel Topic Number Selecting Algorithm for Topic Model

  • Conference paper
Genetic and Evolutionary Computing (ICGEC 2019)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1107)

Abstract

A novel algorithm, named the MTN (Multiple-Topic-Number) algorithm, is introduced to address the problem of selecting the number of topics in a topic model. The algorithm builds LDA (Latent Dirichlet Allocation) matrices for several different topic numbers so that the LDA matrices and the machine learning algorithm can be combined more effectively. It thereby addresses the traditional problem in topic number selection: choosing a number that is too small or too large. The combination is carried out using different levels of a machine learning tree structure. Experimental results show the efficiency of the proposed algorithm.
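
The idea of building LDA matrices for several topic numbers can be illustrated with a minimal sketch. This is not the MTN algorithm itself, whose details are not given in the abstract. It assumes a plain collapsed Gibbs sampler for LDA and simply concatenates the document-topic matrices obtained for each topic number. The function names, hyperparameters (alpha, beta, iters), and toy corpus are all hypothetical, and the tree-structured combination step is not reproduced.

```python
import random


def lda_doc_topic(docs, num_topics, vocab_size, iters=50,
                  alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative only).

    docs is a list of documents, each a list of integer word ids.
    Returns the document-topic matrix; every row sums to 1.
    """
    rng = random.Random(seed)
    n_dk = [[0] * num_topics for _ in docs]               # topic counts per document
    n_kw = [[0] * vocab_size for _ in range(num_topics)]  # word counts per topic
    n_k = [0] * num_topics                                # token totals per topic
    z = []                                                # topic of every token

    # Random initial topic assignment.
    for d, doc in enumerate(docs):
        assignments = []
        for w in doc:
            k = rng.randrange(num_topics)
            assignments.append(k)
            n_dk[d][k] += 1
            n_kw[k][w] += 1
            n_k[k] += 1
        z.append(assignments)

    # Resample each token's topic from its conditional distribution.
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]
                n_dk[d][k] -= 1
                n_kw[k][w] -= 1
                n_k[k] -= 1
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][w] + beta)
                    / (n_k[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][w] += 1
                n_k[k] += 1

    # Smoothed per-document topic proportions.
    theta = []
    for d, doc in enumerate(docs):
        denom = len(doc) + num_topics * alpha
        theta.append([(n_dk[d][k] + alpha) / denom for k in range(num_topics)])
    return theta


def multi_topic_features(docs, topic_numbers, vocab_size):
    """Concatenate document-topic matrices built with several topic numbers."""
    matrices = [lda_doc_topic(docs, k, vocab_size, seed=k) for k in topic_numbers]
    return [sum((m[d] for m in matrices), []) for d in range(len(docs))]


# Toy corpus (hypothetical): four documents over a six-word vocabulary.
docs = [[0, 1, 2, 0, 1], [3, 4, 5, 3, 4], [0, 2, 1, 1], [5, 3, 4, 4]]
features = multi_topic_features(docs, topic_numbers=[2, 3, 4], vocab_size=6)
print(len(features), len(features[0]))
```

Each document ends up with 2 + 3 + 4 = 9 features, one block per topic number, so a downstream classifier is not tied to a single choice that may be too small or too large. Any standard learner could consume these rows; how the paper's tree levels use them is not specified in the abstract.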

Acknowledgement

This work was supported by the Shenzhen Science and Technology Plan under grant number JCYJ20180306171938767 and by the Shenzhen Foundational Research Funding under grant number JCYJ20180507183527919.

Author information

Corresponding author

Correspondence to Linlin Tang.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Tang, L., Zhao, L. (2020). A Novel Topic Number Selecting Algorithm for Topic Model. In: Pan, JS., Lin, JW., Liang, Y., Chu, SC. (eds) Genetic and Evolutionary Computing. ICGEC 2019. Advances in Intelligent Systems and Computing, vol 1107. Springer, Singapore. https://doi.org/10.1007/978-981-15-3308-2_53