Skip to main content

Community Evolution and Change Point Detection in Time-Evolving Graphs

  • Chapter
  • First Online:
Link Mining: Models, Algorithms, and Applications

Abstract

How can we find communities in dynamic networks of social interactions, such as who calls whom, who emails whom, or who sells to whom? How do we store a large volume of IP network source–destination connection graphs, which grow over time? In this chapter, we study these two fundamental problems on time-evolving graphs and exploit the subtle connection between pattern mining and compression. We propose a pattern mining method, GraphScope, that automatically reveals the underlying communities in the graphs, as well as the change points in time. Our method needs no human intervention, and it is carefully designed to operate in a streaming fashion. Moreover, it is based on lossless compression principles. Therefore, in addition to revealing the fundamental structure of the graphs, the discovered patterns naturally lead to an excellent storage scheme for graph streams. Thus, our proposed GraphScope method unifies and solves both the mining and the compression problem (1) by producing meaningful time-evolving patterns agreeing with human intuition and (2) by identifying key change points in several real large time-evolving graphs. We demonstrate its efficiency and effectiveness on real data sets from several domains.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 169.00
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 219.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info
Hardcover Book
USD 219.99
Price excludes VAT (USA)
  • Durable hardcover edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    To encode a positive integer x, we need \({\log^\star} x\approx \log_2 x+\log_2\log_2 x+\cdots \), where only the positive terms are retained and this is the optimal length, if the range of x is unknown [25]

  2. 2.

    One edge from 4 to 3 in \({\ensuremath{G^{(1)}_{}}}\), two edges from 4 to 2 and 3 in \({\ensuremath{G^{(2)}_{}}}\) in Fig. 3.2.

  3. 3.

    \((t_{s+1}-t_s)n\) is the total number of possible edges of a source node in the graph segment

  4. 4.

    Two graphs are superimposed together by taking the union of their edges.

  5. 5.

    The campus network is constantly running some port-scanning program to identify potential vulnerability of the in-network hosts.

  6. 6.

    Due anonymity requirements, the account types are obfuscated.

References

  1. C. C. Aggarwal and P. S. Yu. Online analysis of community evolution in data streams. In SDM, 2005.

    Google Scholar 

  2. R. Rantzau, and R. Srikant. Auditing compliance with a hippocratic database. In VLDB, Toronto, Ontario, Canada, 2004.

    Google Scholar 

  3. B. Babcock, M. Datar, and R. Motwani. Sampling from a moving window over streaming data. In SODA, 2002.

    Google Scholar 

  4. L. Backstrom, D. P. Huttenlocher, J. M. Kleinberg, and X. Lan. Group formation in large social networks: Membership, growth, and evolution. In KDD, pages 44–54, 2006.

    Google Scholar 

  5. D. Chakrabarti. Autopart: Parameter-free graph partitioning and outlier detection. In PKDD, pages 112–124, 2004.

    Google Scholar 

  6. D. Chakrabarti, S. Papadimitriou, D. S. Modha, and C. Faloutsos. Fully automatic cross-associations. In KDD, pages 79–88. ACM Press, 2004.

    Google Scholar 

  7. G. Cormode, M. Datar, P. Indyk, and S. Muthukrishnan. Comparing data streams using hamming norms (how to zero in). In TKDE, 15(3), 2003.

    Google Scholar 

  8. T. M. Cover and J. A. Thomas. Elements of information theory. Wiley-Interscience, New York, NY, 1991.

    Book  Google Scholar 

  9. I. S. Dhillon, S. Mallela, and D. S. Modha. Information-theoretic co-clustering. In KDD, pages 89–98, 2003.

    Google Scholar 

  10. P. Drineas, R. Kannan, and M. W. Mahoney. Fast monte carlo algorithms for matrices iii: Computing a compressed approximate matrix decomposition. In SIAM Journal on Computing, 36(1):184–206, 2006.

    Article  Google Scholar 

  11. M. Faloutsos, P. Faloutsos, and C. Faloutsos. On power-law relationships of the internet topology. In SIGCOMM, 1999.

    Google Scholar 

  12. V. Ganti, J. Gehrke, and R. Ramakrishnan. Mining data streams under block evolution. SIGKDD Explorations Newsletter, 3(2), 2002.

    Google Scholar 

  13. M. Garofalakis and P. B. Gibbons. Probabilistic wavelet synopses. ACM Transactions on Database System, 29(1), 2004.

    Google Scholar 

  14. G. H. Golub and C. F. V. Loan. Matrix Computation. Johns Hopkins University Press, Baltimore, MD, 3rd edition, 1996.

    Google Scholar 

  15. P. Indyk. Stable distributions, pseudorandom generators, embeddings and data stream computation. In FOCS, 2000.

    Google Scholar 

  16. G. Karypis and V. Kumar. Multilevel k-way partitioning scheme for irregular graphs. Journal of Parallel and Distributed Computing, 48(1):96–129, 1998.

    Article  Google Scholar 

  17. E. Keogh, S. Lonardi, and C. A. Ratanamahatana. Towards parameter-free data mining. In KDD, pages 206–215, New York, NY ACM Press, 2004.

    Google Scholar 

  18. R. Kumar, P. Raghavan, S. Rajagopalan, and A. Tomkins. Extracting large-scale knowledge bases from the web. In VLDB, Edinburgh, Scotland, 1999.

    Google Scholar 

  19. J. Leskovec, J. Kleinberg, and C. Faloutsos. Graphs over time: Densification laws, shrinking diameters and possible explanations. In SIGKDD, 2005.

    Google Scholar 

  20. Y.-R. Lin, Y. Chi, S. Zhu, H. Sundaram, and B. Tseng. Analyzing communities and their evolutions in dynamics networks. ACM Transactions on Knowledge Discovery from Data (TKDD), special issue on Social Computing, Behavioral Modeling, 3, 2009.

    Google Scholar 

  21. Y.-R. Lin, J. Sun, P. Castro, R. Konuru, H. Sundaram, and A. Kelliher. Metafac: Community discovery via relational hypergraph factorization. In KDD, 2009.

    Google Scholar 

  22. S. Muthukrishnan. Data streams: algorithms and applications, Foundations and Trends. in Theoretical Computer Science, 1, 2005.

    Google Scholar 

  23. A. Y. Ng, M. I. Jordan, and Y. Weiss. On spectral clustering: Analysis and an algorithm. In NIPS, pages 849–856, 2001.

    Google Scholar 

  24. S. Papadimitriou, J. Sun, and C. Faloutsos. Streaming pattern discovery in multiple time-series. In VLDB, pages 697–708, Trondheim, Norway, 2005.

    Google Scholar 

  25. J. Rissanen. A universal prior for integers and estimation by minimum description length. Annals of Statistics, 11(2):416–431, 1983.

    Article  Google Scholar 

  26. J. Sun, D. Tao, and C. Faloutsos. Beyond streams and graphs: Dynamic tensor analysis. In KDD, pages 374–383, 2006.

    Google Scholar 

  27. J. Sun, Y. Xie, H. Zhang, and C. Faloutsos. Less is more: Compact matrix decomposition for large sparse graphs. In Proceedings of the 2007 SIAM International Conference on Data Mining (SDM), 2007.

    Google Scholar 

  28. Y. Zhu and D. Shasha. Statstream: Statistical monitoring of thousands of data streams in real time. In VLDB, pages 358–369, Hong Kong, China, 2002.

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Jimeng Sun .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2010 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Sun, J., Papadimitriou, S., Yu, P.S., Faloutsos, C. (2010). Community Evolution and Change Point Detection in Time-Evolving Graphs. In: Yu, P., Han, J., Faloutsos, C. (eds) Link Mining: Models, Algorithms, and Applications. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-6515-8_3

Download citation

Publish with us

Policies and ethics