Abstract
The rapid growth of online social media has fundamentally changed people’s lives in recent years. Weibo, a Twitter-like service in China, has attracted more than 500 million users in less than 5 years and produces more than 100 million Chinese tweets every day. In this massive volume of tweets, different topics reflect different user interests and daily trends. To the best of our knowledge, a systematic investigation of topic dynamics in Weibo is still missing. To fill this gap, we comprehensively examine topic dynamics from the perspectives of time, geography, demographics, emotion, retweeting and correlation. We first establish an incremental learning framework to probe more than 200 million streaming tweets and an interaction network of around 90,000 active users. The analysis then reveals many interesting patterns, which could provide insights for topic-related applications in online social media, such as user profiling, event detection, trend tracking and content recommendation.














Acknowledgments
This work was supported by NSFC (Grant No. 61421003) and the fund of the State Key Lab of Software Development Environment (Grant No. SKLSDE-2015ZX-05). Jichang Zhao was partially supported by the fund of the State Key Laboratory of Software Development Environment (Grant No. SKLSDE-2015ZX-28) and the Fundamental Research Funds for the Central Universities (Grant No. YWF-15-JGXY-011).
Cite this article
Fan, R., Zhao, J. & Xu, K. Topic dynamics in Weibo: a comprehensive study. Soc. Netw. Anal. Min. 5, 41 (2015). https://doi.org/10.1007/s13278-015-0282-0
DOI: https://doi.org/10.1007/s13278-015-0282-0