Topic dynamics in Weibo: a comprehensive study

  • Original Article
  • Social Network Analysis and Mining

Abstract

The tremendous growth of online social media has fundamentally changed people’s lives in recent years. Weibo, a Twitter-like service in China, has attracted more than 500 million users in less than 5 years and produces more than 100 million Chinese tweets every day. In this massive stream of tweets, different topics reflect different user interests and daily trends. To the best of our knowledge, a systematic investigation of topic dynamics in Weibo is still missing. To fill this gap, we comprehensively examine topic dynamics from the perspectives of time, geography, demographics, emotion, retweeting, and correlation. We first establish an incremental learning framework to probe more than 200 million streaming tweets and an interaction network of around 90,000 active users. The analysis reveals many patterns that could provide insights for topic-related applications in online social media, such as user profiling, event detection, trend tracking, and content recommendation.

Notes

  1. http://statuscalendar.com.

  2. http://code.google.com/p/plda/.

Acknowledgments

This work was supported by NSFC (Grant No. 61421003) and the fund of the State Key Lab of Software Development Environment (Grant No. SKLSDE-2015ZX-05). Jichang Zhao was partially supported by the fund of the State Key Laboratory of Software Development Environment (Grant No. SKLSDE-2015ZX-28) and the Fundamental Research Funds for the Central Universities (Grant No. YWF-15-JGXY-011).

Author information

Corresponding author

Correspondence to Jichang Zhao.

Cite this article

Fan, R., Zhao, J. & Xu, K. Topic dynamics in Weibo: a comprehensive study. Soc. Netw. Anal. Min. 5, 41 (2015). https://doi.org/10.1007/s13278-015-0282-0
