
Discovering correlated spatio-temporal changes in evolving graphs

  • Regular Paper
  • Knowledge and Information Systems

Abstract

Graphs provide powerful abstractions of relational data, and are widely used in fields such as network management, web page analysis and sociology. While many graph representations of data describe dynamic and time evolving relationships, most graph mining work treats graphs as static entities. Our focus in this paper is to discover regions of a graph that are evolving in a similar manner. To discover regions of correlated spatio-temporal change in graphs, we propose an algorithm called cSTAG. Whereas most clustering techniques are designed to find clusters that optimise a single distance measure, cSTAG addresses the problem of finding clusters that optimise both temporal and spatial distance measures simultaneously. We show the effectiveness of cSTAG using a quantitative analysis of accuracy on synthetic data sets, as well as demonstrating its utility on two large, real-life data sets, where one is the routing topology of the Internet, and the other is the dynamic graph of files accessed together on the 1998 World Cup official website.
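The idea of grouping changes that are close under two distance measures at once can be illustrated with a minimal sketch. This is not cSTAG itself, whose details are not given in the abstract. It assumes a hypothetical input format (each changed edge mapped to a 0/1 change sequence over snapshot transitions), uses hop count in the union graph as the spatial distance and Hamming distance as the temporal distance, and merges edge pairs only when both hypothetical thresholds, `max_hops` and `max_temporal`, are satisfied.

```python
from collections import deque
from itertools import combinations


def hop_distances(adjacency, source):
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def temporal_distance(seq_a, seq_b):
    """Hamming distance between two equal-length change sequences."""
    return sum(a != b for a, b in zip(seq_a, seq_b))


def correlated_regions(changes, max_hops, max_temporal):
    """Group edges that are close both spatially and temporally.

    changes: dict mapping an edge (u, v) to a 0/1 sequence, where 1 marks a
             snapshot transition in which the edge appeared or disappeared
             (an assumed encoding, chosen for illustration only).
    """
    edges = list(changes)

    # Union graph over all changed edges, used for the spatial distance.
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    dist_from = {node: hop_distances(adjacency, node) for node in adjacency}

    def spatial_distance(e, f):
        # Smallest hop count between any endpoint of e and any endpoint of f.
        return min(dist_from[a].get(b, float("inf")) for a in e for b in f)

    # Union-find: merge edge pairs that satisfy BOTH thresholds.
    parent = {e: e for e in edges}

    def find(e):
        while parent[e] != e:
            parent[e] = parent[parent[e]]
            e = parent[e]
        return e

    for e, f in combinations(edges, 2):
        if (spatial_distance(e, f) <= max_hops
                and temporal_distance(changes[e], changes[f]) <= max_temporal):
            parent[find(e)] = find(f)

    groups = {}
    for e in edges:
        groups.setdefault(find(e), []).append(e)
    return sorted(sorted(group) for group in groups.values())


# Toy data: two adjacent edges change together, a third adjacent edge changes
# at different times, and a fourth changes in step but is disconnected.
example_changes = {
    ("a", "b"): [1, 0, 1, 0],
    ("b", "c"): [1, 0, 1, 0],
    ("c", "d"): [0, 1, 0, 1],
    ("x", "y"): [1, 0, 1, 0],
}
regions = correlated_regions(example_changes, max_hops=1, max_temporal=0)
```

In the toy data, only the edges (a, b) and (b, c) end up in the same region: (c, d) is spatially adjacent but changes at different times, and (x, y) changes at the same times but lies in a disconnected part of the graph. Requiring both conditions is what separates this from clustering on a single distance measure.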

Corresponding author

Correspondence to Jeffrey Chan.

About this article

Cite this article

Chan, J., Bailey, J. & Leckie, C. Discovering correlated spatio-temporal changes in evolving graphs. Knowl Inf Syst 16, 53–96 (2008). https://doi.org/10.1007/s10115-007-0117-z

  • DOI: https://doi.org/10.1007/s10115-007-0117-z