Abstract
In social networking services (SNSs), persistent topics are extremely rare and valuable. In this paper, we propose an algorithm for detecting persistent topics in SNSs based on the topic graph. A topic graph is a subgraph of the ordinary social network graph that consists of the users who have shared a certain topic up to some time point. Based on the assumption that the topic graphs associated with persistent and non-persistent topics evolve differently over time, we detect persistent topics by performing anomaly detection on feature values extracted from the time evolution of the topic graph. For anomaly detection, we use principal component analysis to capture the subspace spanned by normal (non-persistent) topics. We demonstrate our technique on a real dataset gathered from Twitter and show that it performs significantly better than baseline methods based on power-law curve fitting, the linear influence model, ridge regression, and support vector machines.
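The anomaly-detection step described above can be illustrated with a minimal sketch. It assumes that a topic is scored by its squared distance from a PCA subspace fitted to normal (non-persistent) topics, which is a common way to use PCA for anomaly detection but is not confirmed by the abstract as the exact scoring rule. The feature vectors are synthetic stand-ins for topic-graph features, and the names `fit_normal_subspace` and `anomaly_score` are hypothetical.

```python
import numpy as np

def fit_normal_subspace(X_normal, n_components):
    """Fit a PCA subspace to feature vectors of normal (non-persistent) topics."""
    mean = X_normal.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X_normal - mean, full_matrices=False)
    return mean, Vt[:n_components]

def anomaly_score(x, mean, components):
    """Squared distance from x to the fitted subspace (reconstruction error)."""
    centered = x - mean
    projection = components.T @ (components @ centered)
    return float(np.sum((centered - projection) ** 2))

rng = np.random.default_rng(0)

# Synthetic data (an assumption, not the paper's features): normal topics
# vary mostly along two latent directions of a 5-dimensional feature space.
basis = rng.normal(size=(2, 5))
X_normal = rng.normal(size=(200, 2)) @ basis + 0.01 * rng.normal(size=(200, 5))

mean, components = fit_normal_subspace(X_normal, n_components=2)

# A topic whose features lie in the normal subspace.
in_subspace = rng.normal(size=2) @ basis

# A topic pushed along a direction orthogonal to the normal subspace.
# The last rows of the full right-singular matrix span the null space of basis.
_, _, Vb = np.linalg.svd(basis)
off_subspace = in_subspace + 3.0 * Vb[-1]

score_in = anomaly_score(in_subspace, mean, components)
score_off = anomaly_score(off_subspace, mean, components)
print(f"normal-like topic score: {score_in:.6f}")
print(f"anomalous topic score:   {score_off:.6f}")
```

In this sketch, a topic would be flagged as a candidate persistent topic when its score exceeds a threshold chosen on held-out normal topics; how the threshold and the number of components are selected is left open here.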












Acknowledgments
This work was partially supported by MEXT KAKENHI 23240019 and JST–CREST.
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Saito, S., Tomioka, R. & Yamanishi, K. Early detection of persistent topics in social networks. Soc. Netw. Anal. Min. 5, 19 (2015). https://doi.org/10.1007/s13278-015-0257-1
DOI: https://doi.org/10.1007/s13278-015-0257-1