DOI: 10.1145/1935826.1935909
poster

Transient crowd discovery on the real-time social web

Published: 09 February 2011

ABSTRACT

In this paper, we study the problem of automatically discovering and tracking transient crowds in highly dynamic social messaging systems like Twitter and Facebook. Unlike the more static and long-lived group-based membership offered on many social networks (e.g., fan of the LA Lakers), a transient crowd is a short-lived ad-hoc collection of users, representing a "hotspot" on the real-time web. Successful detection of these hotspots can positively impact related research directions such as online event detection, content personalization, and social information discovery. Concretely, we propose to model crowd formation and dispersion through a message-based communication clustering approach over time-evolving graphs that captures the conversational nature of social messaging systems. Two salient features of the proposed approach are (i) an efficient locality-based clustering approach that identifies crowds of users in near real-time, in contrast to more heavyweight static clustering algorithms; and (ii) a novel crowd tracking and evolution approach for linking crowds across time periods. We find that the locality-based clustering approach yields empirically high-quality clusters relative to static graph clustering techniques at a fraction of the computational cost. Based on a three-month snapshot of Twitter consisting of 711,612 users and 61.3 million messages, we show how the proposed approach can successfully identify and track interesting crowds based on the Twitter communication structure and uncover crowd-based topics of interest.
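The general shape of the pipeline described in the abstract (build a communication graph per time window, group users into crowds, link crowds across windows) can be illustrated with a toy sketch. This is not the authors' algorithm: connected components over thresholded edges stand in for the locality-based clustering, and Jaccard overlap stands in for the crowd tracking step. All function names, the `(time, sender, receiver)` message format, and the thresholds are assumptions made for illustration.

```python
from collections import defaultdict


def build_graph(messages, t_start, t_end):
    """Count messages per user pair within the window [t_start, t_end).

    `messages` is an iterable of (time, sender, receiver) tuples (assumed format).
    """
    weights = defaultdict(int)
    for t, u, v in messages:
        if t_start <= t < t_end and u != v:
            weights[frozenset((u, v))] += 1
    return weights


def crowds(weights, min_weight=2, min_size=3):
    """Connected components over edges with weight >= min_weight.

    A simple stand-in for a real graph clustering step.
    """
    adj = defaultdict(set)
    for edge, w in weights.items():
        if w >= min_weight:
            u, v = tuple(edge)
            adj[u].add(v)
            adj[v].add(u)
    seen, result = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        if len(comp) >= min_size:
            result.append(frozenset(comp))
    return result


def jaccard(a, b):
    """Jaccard similarity of two non-empty sets."""
    return len(a & b) / len(a | b)


def link_crowds(prev, curr, min_overlap=0.5):
    """Pair each current crowd with its best-matching previous crowd."""
    links = []
    for c in curr:
        best = max(prev, key=lambda p: jaccard(p, c), default=None)
        if best is not None and jaccard(best, c) >= min_overlap:
            links.append((best, c))
    return links
```

Calling `crowds(build_graph(messages, t0, t1))` for consecutive windows and passing the two results to `link_crowds` gives a chain of crowds over time; a crowd with no link forward can be read as having dispersed under this simplified model.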


      Published in

      WSDM '11: Proceedings of the fourth ACM international conference on Web search and data mining
      February 2011
      870 pages
      ISBN: 9781450304931
      DOI: 10.1145/1935826

      Copyright © 2011 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      WSDM '11 Paper Acceptance Rate: 83 of 372 submissions, 22%
      Overall Acceptance Rate: 498 of 2,863 submissions, 17%
