TWStream: Finding Correlated Data Streams Under Time Warping

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3841)

Abstract

Consider the problem of monitoring multiple data streams and finding all correlated pairs in real time. Such correlations are of special interest in many applications; for example, the prices of two stocks may show very similar rise/fall patterns, which gives a market trader an arbitrage opportunity. However, the correlated patterns may occur at any unknown scale, with arbitrary lag, or even out of phase, which defeats most traditional methods. In this paper, we propose TWStream, a method that detects pairs of streams whose subsequences are correlated under elastic shift and arbitrary lag along the time axis. Specifically, (1) to accommodate varying scale and arbitrary lag, we use the geometric time frame in conjunction with a piecewise smoothing approach; (2) to detect unsynchronized correlation, we extend cross correlation to support time warping, which is shown to be much more robust than Euclidean-based metrics. Our method has a sound theoretical foundation and is efficient in both time and space complexity. Experiments on both synthetic and real data demonstrate its effectiveness and efficiency.
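The idea of correlation under time warping can be illustrated with a small sketch. This is not the TWStream algorithm, whose definitions are not given in the abstract; it is a generic illustration that assumes classic dynamic time warping (DTW) is used to align two windows before computing Pearson correlation on the aligned pairs. All function names (`piecewise_average`, `dtw_path`, `warped_correlation`) are hypothetical, and the geometric time frame is omitted.

```python
import math

def piecewise_average(xs, width):
    """Smooth xs by averaging consecutive non-overlapping windows (assumed smoothing)."""
    chunks = [xs[i:i + width] for i in range(0, len(xs), width)]
    return [sum(c) / len(c) for c in chunks]

def z_normalize(xs):
    """Shift to zero mean and scale to unit variance (constant input maps to zeros)."""
    mean = sum(xs) / len(xs)
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))
    return [0.0] * len(xs) if std == 0 else [(x - mean) / std for x in xs]

def dtw_path(a, b):
    """Return an optimal DTW alignment of a and b as a list of (i, j) index pairs."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    # Backtrack from the end, always moving to the cheapest predecessor cell.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return path

def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return 0.0 if sxx == 0 or syy == 0 else sxy / math.sqrt(sxx * syy)

def warped_correlation(a, b):
    """Pearson correlation computed over DTW-aligned pairs of the normalized inputs."""
    za, zb = z_normalize(a), z_normalize(b)
    path = dtw_path(za, zb)
    return pearson([za[i] for i, _ in path], [zb[j] for _, j in path])

# The same rise/fall pattern, but b lags a by four time steps.
a = [0, 1, 2, 3, 2, 1, 0, 0, 0, 0]
b = [0, 0, 0, 0, 1, 2, 3, 2, 1, 0]
print(pearson(a, b))             # point-by-point correlation is weak
print(warped_correlation(a, b))  # correlation after warping is close to 1
```

In this toy example the lagged pattern looks uncorrelated when compared point by point, while the warped comparison recovers the shared shape. The quadratic cost of full DTW is one reason a streaming method would need summarization, such as the smoothing step sketched in `piecewise_average`.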

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, T. (2006). TWStream: Finding Correlated Data Streams Under Time Warping. In: Zhou, X., Li, J., Shen, H.T., Kitsuregawa, M., Zhang, Y. (eds) Frontiers of WWW Research and Development - APWeb 2006. APWeb 2006. Lecture Notes in Computer Science, vol 3841. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11610113_20

  • DOI: https://doi.org/10.1007/11610113_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-31142-3

  • Online ISBN: 978-3-540-32437-9

  • eBook Packages: Computer Science, Computer Science (R0)
