Abstract
Consider the problem of monitoring multiple data streams and finding all correlated pairs in real time. Such correlations are of special interest in many applications; e.g., the prices of two stocks may show quite similar rise/fall patterns, which offers the market trader an arbitrage opportunity. However, the correlated patterns may occur at any unknown scale, with arbitrary lag, or even out of phase, which defeats most traditional methods. In this paper, we propose TWStream, a method that detects pairs of streams whose subsequences are correlated under elastic shift and arbitrary lag along the time axis. Specifically, (1) to accommodate varying scale and arbitrary lag, we propose to use the geometric time frame in conjunction with a piecewise smoothing approach; (2) to detect unsynchronized correlation, we extend the cross correlation to support time warping, which proves much more robust than Euclidean-based metrics. Our method has a sound theoretical foundation and is efficient in terms of both time and space complexity. Experiments on both synthetic and real data demonstrate its effectiveness and efficiency.
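To illustrate the general idea of correlation under time warping, the following is a minimal sketch, not the TWStream algorithm itself. It assumes the textbook dynamic time warping (DTW) recurrence with a squared-error local cost, aligns two short sequences along the optimal warping path, and then computes the Pearson correlation of the aligned values. The function names `dtw_path`, `pearson`, and `warped_correlation` are hypothetical, and the sketch ignores the geometric time frame, piecewise smoothing, and any streaming or efficiency concerns.

```python
import math


def dtw_path(x, y):
    """Return an optimal DTW alignment of x and y as a list of (i, j) index pairs."""
    n, m = len(x), len(y)
    inf = float("inf")
    # cost[i][j] = minimal cumulative squared error aligning x[:i] with y[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (x[i - 1] - y[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the last cell to the first, always moving to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),  # diagonal step
            (cost[i - 1][j], i - 1, j),          # advance in x only
            (cost[i][j - 1], i, j - 1),          # advance in y only
        ]
        _, i, j = min(candidates)
    path.reverse()
    return path


def pearson(a, b):
    """Plain Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def warped_correlation(x, y):
    """Pearson correlation of x and y after aligning them along the DTW path."""
    path = dtw_path(x, y)
    aligned_x = [x[i] for i, _ in path]
    aligned_y = [y[j] for _, j in path]
    return pearson(aligned_x, aligned_y)


# Two copies of the same bump, one delayed by three samples (illustrative data only).
x = [0, 1, 2, 3, 2, 1, 0, 0, 0]
y = [0, 0, 0, 1, 2, 3, 2, 1, 0]
print("plain correlation: ", pearson(x, y))
print("warped correlation:", warped_correlation(x, y))
```

On lagged inputs like these, the plain correlation is low because the peaks do not line up sample by sample, while the warped version recovers the shared shape. The quadratic-time table used here is only suitable for short windows.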
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, T. (2006). TWStream: Finding Correlated Data Streams Under Time Warping. In: Zhou, X., Li, J., Shen, H.T., Kitsuregawa, M., Zhang, Y. (eds) Frontiers of WWW Research and Development - APWeb 2006. APWeb 2006. Lecture Notes in Computer Science, vol 3841. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11610113_20
DOI: https://doi.org/10.1007/11610113_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-31142-3
Online ISBN: 978-3-540-32437-9
eBook Packages: Computer Science, Computer Science (R0)