Research on Hot Topic Discovery Technology of Micro-blog Based on Biterm Topic Model

  • Conference paper
Geo-Spatial Knowledge and Intelligence (GRMSE 2016)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 699)

Abstract

To overcome the data sparsity and expression diversity of short texts and to improve clustering quality, this paper proposes a text feature enhancement method based on the biterm topic model (BTM). First, we use BTM to extract the underlying topics from the corpus and obtain a matrix of high-frequency words for each topic. We then use this matrix to selectively strengthen the traditional vector space model (VSM), which reduces the vector dimension and highlights the main features. We also propose a heat calculation equation that combines the propagation characteristics and the time effect of micro-blogs, so that the evolution of a topic can be better demonstrated and analyzed. Experiments show that the method improves clustering quality and that the heat calculation equation benefits both the discovery of hot topics and the tracking of their evolution.
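The abstract does not give the exact enhancement rule, so the following is only a minimal sketch of one way a topic word matrix could selectively strengthen a VSM. It assumes the topics have already been fitted by some BTM implementation and are available as a list of word-to-probability mappings. The function names `tfidf_vectors` and `enhance_with_topics`, the `boost` parameter, and the weighting rule `1 + boost * max P(w|z)` are illustrative assumptions, not the authors' method.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Plain TF-IDF over tokenized documents (each doc is a list of tokens)."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        if not doc:
            vectors.append({})
            continue
        counts = Counter(doc)
        vectors.append({
            # The "+ 1" keeps terms that occur in every document from vanishing.
            term: (count / len(doc)) * (math.log(n_docs / doc_freq[term]) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def enhance_with_topics(vectors, topic_top_words, boost=1.0):
    """Keep only topic high-frequency words and scale them up (assumed rule).

    topic_top_words: one dict per topic, mapping word -> P(word | topic),
    restricted to that topic's high-frequency words.
    """
    # Strongest association of each word with any topic.
    strength = {}
    for word_probs in topic_top_words:
        for word, prob in word_probs.items():
            strength[word] = max(strength.get(word, 0.0), prob)

    enhanced = []
    for vec in vectors:
        enhanced.append({
            # Dropping non-topic terms reduces the dimension; the factor
            # highlights terms that characterize a topic.
            term: weight * (1.0 + boost * strength[term])
            for term, weight in vec.items()
            if term in strength
        })
    return enhanced


docs = [
    ["earthquake", "rescue", "today"],
    ["earthquake", "relief", "lol"],
]
topics = [{"earthquake": 0.4, "rescue": 0.2}]  # hypothetical BTM output
base = tfidf_vectors(docs)
enhanced = enhance_with_topics(base, topics)
```

In this sketch, words outside every topic's high-frequency list are dropped, and the surviving weights grow with their strongest topic probability. The enhanced vectors could then be passed to any clustering step.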

Corresponding author

Correspondence to Jun Feng.

Copyright information

© 2017 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Feng, J., Fang, Y. (2017). Research on Hot Topic Discovery Technology of Micro-blog Based on Biterm Topic Model. In: Yuan, H., Geng, J., Bian, F. (eds) Geo-Spatial Knowledge and Intelligence. GRMSE 2016. Communications in Computer and Information Science, vol 699. Springer, Singapore. https://doi.org/10.1007/978-981-10-3969-0_27

  • DOI: https://doi.org/10.1007/978-981-10-3969-0_27

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-3968-3

  • Online ISBN: 978-981-10-3969-0

  • eBook Packages: Computer Science, Computer Science (R0)
