Cease with Bass: A Framework for Real-Time Topic Detection and Popularity Prediction Based on Long-Text Contents

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11280)

Abstract

Social networks have become a powerful information source. With the advent of new services such as WeChat Official Accounts, long-text content has become part of social networks. Compared with tweet-style content, long-text content is better organized and less prone to noise. However, existing methods for real-time topic detection on long-text data fall short in sensitivity and scalability, and long-text-based trend prediction methods lack a strong rationale. In this paper, we propose a framework specifically adapted for long-text-based topic analysis, covering both topic detection and popularity prediction. For topic detection, we design a novel real-time topic model, the Cost-Effective And Scalable Embedding model (CEASE), based on improved GloVe models and a keyword frequency clustering algorithm. We then propose strategies for topic tracking and renewal that take topic abortion, merging, and neologisms into account. For popularity prediction, we propose the Feature-Combined Bass model with Association Analysis (FCA-Bass), whose rationale is adapted from economics. Our methods are validated by experiments on a real-world dataset from WeChat and are shown to outperform several existing mainstream methods.
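
For readers unfamiliar with the economic model that FCA-Bass builds on, the following is a minimal sketch of the classical Bass diffusion model (Bass, 1969) only. It is not the FCA-Bass method itself, whose feature combination and association analysis are not described in this abstract. The function names and the parameter values in the usage note are illustrative assumptions: m is the eventual audience size, p the coefficient of innovation, and q the coefficient of imitation.

```python
import math


def bass_cumulative_fraction(t, p, q):
    """Closed-form cumulative adoption fraction F(t) of the classical Bass model.

    F(t) = (1 - exp(-(p + q) t)) / (1 + (q / p) exp(-(p + q) t)), with p > 0.
    """
    if t <= 0:
        return 0.0
    decay = math.exp(-(p + q) * t)
    return (1.0 - decay) / (1.0 + (q / p) * decay)


def bass_cumulative_adopters(t, m, p, q):
    """Expected cumulative number of adopters at time t for market size m."""
    return m * bass_cumulative_fraction(t, p, q)


def bass_peak_time(p, q):
    """Time at which the adoption rate peaks; meaningful only when q > p."""
    return math.log(q / p) / (p + q)


if __name__ == "__main__":
    # Hypothetical parameters chosen purely for illustration.
    m, p, q = 10_000, 0.03, 0.38
    for t in (1, 5, 10, 20):
        print(t, round(bass_cumulative_adopters(t, m, p, q)))
    print("peak time:", bass_peak_time(p, q))
```

In a popularity-prediction setting, "adoption" would be read as a user viewing or sharing a topic, and p, q, and m would be fitted from early observations. How the paper estimates or augments these parameters is not stated here.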

Acknowledgements

This work is supported by the National Key R&D Program of China (2018YFB1004703), the National Natural Science Foundation of China (61872238, 61672348, 61672353), the Shanghai Science and Technology Fund (17510740200), the CCF-Tencent Open Research Fund (RAGR20170114), the Huawei Innovation Research Program (HO2018085286), and the National Key Research of China (2018YFB1003800). Quanquan Chu completed the experiments in this paper while he was an intern at Tencent Shenzhen. The authors would also like to thank Chunxia Jia, Yiming Zhang, Chao Wang, and Tianxiang Gao for their contributions to this paper.

Author information

Corresponding author

Correspondence to Xiaofeng Gao.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Chu, Q., Cao, Z., Gao, X., He, P., Deng, Q., Chen, G. (2018). Cease with Bass: A Framework for Real-Time Topic Detection and Popularity Prediction Based on Long-Text Contents. In: Chen, X., Sen, A., Li, W., Thai, M. (eds) Computational Data and Social Networks. CSoNet 2018. Lecture Notes in Computer Science, vol 11280. Springer, Cham. https://doi.org/10.1007/978-3-030-04648-4_5

  • DOI: https://doi.org/10.1007/978-3-030-04648-4_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-04647-7

  • Online ISBN: 978-3-030-04648-4

  • eBook Packages: Computer Science, Computer Science (R0)