Incremental pattern discovery on streams, graphs and tensors

Published: 20 December 2008

Abstract

Incremental pattern discovery targets streaming applications in which data arrive continuously. The questions are how to find patterns (main trends) incrementally, how to efficiently update the old patterns when new data arrive, and how to use the patterns to solve other problems such as anomaly detection.
We first investigate a powerful data model, the tensor stream (TS), in which there is one tensor per timestamp. To capture diverse data formats, we have a zero-order TS for a single time-series (e.g., a stock price over time), a first-order TS for multiple time-series (e.g., sensor measurement streams), a second-order TS for matrices (e.g., graphs), and a high-order TS for multi-way arrays (e.g., an Internet communication network indexed by source, destination and port). Second, we develop online algorithms on TS: 1) the centralized and distributed SPIRIT [7] for mining a first-order TS, as well as its extensions for local correlation function and privacy preservation; 2) the compact matrix decomposition (CMD) [5] and GraphScope [4] for a second-order TS; 3) dynamic tensor analysis (DTA) [2], streaming tensor analysis (STA) and window-based tensor analysis (WTA) for a high-order TS. All the techniques are extensively evaluated on real applications such as network forensics and cluster monitoring.
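
As a rough illustration of the tensor stream model described above, the following Python sketch represents each stream as a sequence of nested lists, one tensor per timestamp, and reports the order of each tensor. It also shows the general idea of updating an old summary when new data arrive, using a running mean over a first-order TS. The running mean is only a stand-in for a "pattern"; it is not SPIRIT, CMD, GraphScope, DTA, STA or WTA, and the names `order` and `running_mean` as well as all sample values are hypothetical.

```python
from typing import Iterable, Iterator, List


def order(tensor) -> int:
    """Return the number of modes of a nested-list tensor (0 for a scalar)."""
    modes = 0
    while isinstance(tensor, list):
        modes += 1
        tensor = tensor[0]  # descend into the first element of the next mode
    return modes


def running_mean(stream: Iterable[List[float]]) -> Iterator[List[float]]:
    """Yield the mean of all vectors seen so far, one result per timestamp.

    Each update touches only the previous mean and the new vector, so past
    data never has to be stored or revisited.
    """
    mean: List[float] = []
    for count, vector in enumerate(stream, start=1):
        if count == 1:
            mean = list(vector)
        else:
            mean = [m + (v - m) / count for m, v in zip(mean, vector)]
        yield list(mean)


# Hypothetical sample streams: one tensor per timestamp.
zero_order_ts = [101.2, 101.5, 100.9]                       # single time-series
first_order_ts = [[20.0, 30.0], [22.0, 34.0], [24.0, 32.0]]  # two sensors
second_order_ts = [[[0, 1], [1, 0]], [[0, 1], [0, 0]]]       # adjacency matrices
third_order_ts = [[[[1, 0], [0, 0]], [[0, 2], [0, 0]]]]      # source x destination x port

if __name__ == "__main__":
    for name, ts in [("zero", zero_order_ts), ("first", first_order_ts),
                     ("second", second_order_ts), ("third", third_order_ts)]:
        print(name, "order TS, tensor order =", order(ts[0]))
    for t, mean in enumerate(running_mean(first_order_ts), start=1):
        print("timestamp", t, "running mean", mean)
```

The incremental update costs time proportional to the size of one tensor per timestamp, which is the property that any streaming method of this kind needs; the actual algorithms in the cited papers maintain richer summaries than a mean.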

References

[1] Evan Hoke, Jimeng Sun, John D. Strunk, Gregory R. Ganger, and Christos Faloutsos. InteMon: Continuous mining of sensor data in large-scale self-* infrastructures. ACM SIGOPS Operating Systems Review, 40(3), 2006.
[2] Jimeng Sun, Dacheng Tao, and Christos Faloutsos. Beyond streams and graphs: Dynamic tensor analysis. In Proceedings of the Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD), 2006.
[3] Jimeng Sun, Spiros Papadimitriou, and Christos Faloutsos. Distributed pattern discovery in multiple streams. In Proceedings of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), 2006.
[4] Jimeng Sun, Spiros Papadimitriou, Christos Faloutsos, and Philip S. Yu. GraphScope: Parameter-free mining of large time-evolving graphs. In KDD, pages 687--696, 2007.
[5] Jimeng Sun, Yinglian Xie, Hui Zhang, and Christos Faloutsos. Less is more: Compact matrix decomposition for large sparse graphs. In SDM, 2007.
[6] Lieven De Lathauwer. Signal Processing Based on Multilinear Algebra. PhD thesis, Katholieke Universiteit Leuven, Belgium, 1997.
[7] Spiros Papadimitriou, Jimeng Sun, and Christos Faloutsos. Streaming pattern discovery in multiple time-series. In VLDB, pages 697--708, 2005.
[8] Stephen Bay, Krishna Kumaraswamy, Markus G. Anderle, Rohit Kumar, and David M. Steier. Large scale detection of irregularities in accounting data. In ICDM, pages 75--86, 2006.

Published In

ACM SIGKDD Explorations Newsletter  Volume 10, Issue 2
December 2008
98 pages
ISSN:1931-0145
EISSN:1931-0153
DOI:10.1145/1540276

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. data mining
  2. graph mining
  3. stream mining
  4. tensor analysis
