ABSTRACT
In this paper, we study the problem of automatically discovering and tracking transient crowds in highly dynamic social messaging systems like Twitter and Facebook. Unlike the more static and long-lived group-based membership offered on many social networks (e.g., fan of the LA Lakers), a transient crowd is a short-lived ad-hoc collection of users, representing a "hotspot" on the real-time web. Successful detection of these hotspots can positively impact related research directions in online event detection, content personalization, and social information discovery. Concretely, we propose to model crowd formation and dispersion through a message-based communication clustering approach over time-evolving graphs that captures the conversational nature of social messaging systems. Two salient features of the proposed approach are (i) an efficient locality-based clustering approach that identifies crowds of users in near real-time, in contrast to more heavyweight static clustering algorithms; and (ii) a novel crowd tracking and evolution approach for linking crowds across time periods. We find that the locality-based clustering approach yields empirically high-quality clusters relative to static graph clustering techniques at a fraction of the computational cost. Based on a three-month snapshot of Twitter consisting of 711,612 users and 61.3 million messages, we show how the proposed approach can successfully identify and track interesting crowds based on the Twitter communication structure and uncover crowd-based topics of interest.
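To make the two ideas in the abstract concrete, the following is a minimal sketch and not the paper's method. The abstract does not specify the clustering algorithm, so the sketch swaps in a much simpler stand-in: crowds are taken to be connected components over user pairs that exchanged at least `min_weight` messages in a time window, and crowds in consecutive windows are linked when their Jaccard overlap reaches `min_jaccard`. All function names, parameters, and thresholds here are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_graph(messages):
    """Count messages per unordered user pair within one time window.

    `messages` is an iterable of (sender, recipient) pairs (assumed format).
    """
    weights = defaultdict(int)
    for sender, recipient in messages:
        if sender != recipient:  # ignore self-addressed messages
            weights[frozenset((sender, recipient))] += 1
    return weights


def find_crowds(weights, min_weight=2):
    """Stand-in for clustering: connected components over strong edges."""
    adjacency = defaultdict(set)
    for edge, weight in weights.items():
        if weight >= min_weight:
            u, v = tuple(edge)
            adjacency[u].add(v)
            adjacency[v].add(u)

    seen, crowds = set(), []
    for start in adjacency:
        if start in seen:
            continue
        # Depth-first traversal collects one component.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        crowds.append(frozenset(component))
    return crowds


def link_crowds(previous, current, min_jaccard=0.5):
    """Link crowds across two windows by Jaccard overlap of their members."""
    links = []
    for i, old in enumerate(previous):
        for j, new in enumerate(current):
            jaccard = len(old & new) / len(old | new)
            if jaccard >= min_jaccard:
                links.append((i, j))
    return links
```

Calling `find_crowds(build_graph(window))` once per time window and passing consecutive results to `link_crowds` gives a crude picture of crowds forming, persisting, and dispersing. A locality-based approach as described in the abstract would presumably update only the part of the graph touched by new messages rather than recomputing every component per window.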