Discovering sub-patterns from time series using a normalized cross-match algorithm

Gong, Xueyuan; Fong, Simon; Wong, Raymond K.; Mohammed, Sabah; Fiaidhi, Jinan; Vasilakos, Athanasios V.

doi:10.1007/s11227-016-1632-z

Published: 04 February 2016

Volume 72, pages 3850–3867, (2016)

The Journal of Supercomputing

Xueyuan Gong¹,
Simon Fong¹,
Raymond K. Wong²,
Sabah Mohammed³,
Jinan Fiaidhi³ &
Athanasios V. Vasilakos⁴

Abstract

Time series data stream mining has attracted considerable research interest in recent years, and pattern discovery is one of its challenging problems. Because the data are updated continuously and sampling rates may differ, dynamic time warping (DTW)-based approaches are used to solve the pattern discovery problem in time series data streams. However, the naive form of the DTW-based approach is computationally expensive. Toyoda therefore proposed the CrossMatch (CM) approach to discover patterns between two time series data streams (sequences); it requires only O(n) time per data update, where n is the length of one sequence. CM, however, does not support normalization, which is required for some kinds of sequences (e.g., stock prices, ECG data). We therefore propose a normalized-CrossMatch approach that extends CM to enforce normalization while maintaining the same performance.

References

Sakurai Y, Faloutsos C, Yamamuro M (2007) Stream monitoring under the time warping distance. In: IEEE 23rd international conference on data engineering (ICDE), pp 1046–1055
Gong X, Si Y-W, Fong S, Mohammed S (2014) Nspring: normalization-supported spring for subsequence matching on time series streams. In: IEEE 15th international symposium on computational intelligence and informatics (CINTI), pp 373–378
Toyoda M, Sakurai Y, Ichikawa T (2008) Identifying similar subsequences in data streams. In: Database and expert systems applications, pp 210–224
Toyoda M, Sakurai Y (2010) Discovery of cross-similarity in data streams. In: IEEE 26th international conference on data engineering (ICDE), pp 101–104
Toyoda M, Sakurai Y, Ishikawa Y (2013) Pattern discovery in data streams under the time warping distance. VLDB J 22(3):295–318
Angiulli F, Fassetti F (2007) Detecting distance-based outliers in streams of data. In: Proceedings of the 16th conference on information and knowledge management (CIKM), pp 811–820
Bu Y, Chen L, Fu AW-C, Liu D (2009) Efficient anomaly monitoring over moving object trajectory streams. In: Proceedings of the 15th international conference on knowledge discovery and data mining (SIGKDD), pp 159–168
Keogh E, Kasetty S (2003) On the need for time series data mining benchmarks: a survey and empirical demonstration. Data Min Knowl Discov 7(4):349–371
Rakthanmanon T, Campana B, Mueen A, Batista G, Westover B, Zhu Q, Zakaria J, Keogh E (2012) Searching and mining trillions of time series subsequences under dynamic time warping. In: Proceedings of the 18th international conference on knowledge discovery and data mining (SIGKDD), pp 262–270
Aach J, Church GM (2001) Aligning gene expression time series with time warping algorithms. Bioinformatics 17(6):495–508
Yi B-K, Jagadish H, Faloutsos C (1998) Efficient retrieval of similar time sequences under time warping. In: Proceedings of the 14th international conference on data engineering (ICDE), pp 201–208
Itakura F (1975) Minimum prediction residual principle applied to speech recognition. IEEE Trans Acoust Speech Signal Process 23(1):67–72
Sakoe H, Chiba S (1978) Dynamic programming algorithm optimization for spoken word recognition. IEEE Trans Acoust Speech Signal Process 26(1):43–49
Keogh E, Ratanamahatana CA (2005) Exact indexing of dynamic time warping. Knowl Inf Syst 7(3):358–386
Keogh E, Wei L, Xi X, Vlachos M, Lee S-H, Protopapas P (2009) Supporting exact indexing of arbitrarily rotated shapes and periodic time series under Euclidean and warping distance measures. VLDB J 18(3):611–630
Chiu B, Keogh E, Lonardi S (2003) Probabilistic discovery of time series motifs. In: Proceedings of the 9th international conference on knowledge discovery and data mining (SIGKDD), pp 493–498
Mueen A (2013) Enumeration of time series motifs of all lengths. In: IEEE 13th international conference on data mining (ICDM), pp 547–556
Mueen A, Keogh EJ, Zhu Q, Cash S, Westover MB (2009) Exact discovery of time series motifs. In: SDM, pp 473–484
Ringeval F, Sonderegger A, Sauer J, Lalanne D (2013) Introducing the RECOLA multimodal corpus of remote collaborative and affective interactions. In: 10th IEEE international conference and workshops on automatic face and gesture recognition (FG), pp 1–8
Agrawal R, Faloutsos C, Swami AN (1993) Efficient similarity search in sequence databases. In: Proceedings of the 4th international conference on foundations of data organization and algorithms (FODO), pp 69–84
Wan Y, Gong X, Si Y-W (2016) Effect of segmentation on financial time series pattern matching. Appl Soft Comput 38:346–359

Acknowledgments

The authors are thankful for the financial support from the research grant “Temporal Data Stream Mining by Using Incrementally Optimized Very Fast Decision Forest (iOVFDF)”, Grant No. MYRG2015-00128-FST, offered by the University of Macau, FST, and RDAO.

Author information

Authors and Affiliations

Department of Computer and Information Science, University of Macau, Macau, China
Xueyuan Gong & Simon Fong
School of Computer Science and Engineering, University of New South Wales, Sydney, Australia
Raymond K. Wong
Department of Computer Science, Lakehead University, Thunder Bay, Canada
Sabah Mohammed & Jinan Fiaidhi
Department of Computer Science, Electrical and Space Engineering, Lulea University of Technology, Lulea, Sweden
Athanasios V. Vasilakos

Corresponding author

Correspondence to Simon Fong.

Appendix

The standard deviation $\sigma _{i,j}$ of the subsequence $x_{i},\ldots ,x_{j}$ is originally defined as:

$$\begin{aligned} \sigma _{i,j}=\sqrt{\frac{1}{j-i+1}\sum _{r=i}^{j} (x_{r}-\mu _{i,j})^{2}} \end{aligned}$$

Squaring both sides and expanding the squared term gives:

$$\begin{aligned} \sigma ^{2}_{i,j}&=\frac{1}{j-i+1}\sum _{r=i}^{j}(x_{r}-\mu _{i,j})^{2}\\&=\frac{1}{j-i+1}\sum _{r=i}^{j}(x^{2}_{r}-2x_{r}\mu _{i,j}+\mu ^{2}_{i,j})\\&=\frac{1}{j-i+1}\left( \sum _{r=i}^{j}x^{2}_{r}-\sum _{r=i}^{j}2x_{r}\mu _{i,j}+\sum _{r=i}^{j}\mu ^{2}_{i,j}\right) \\&=\frac{1}{j-i+1}\left( \sum _{r=i}^{j}x^{2}_{r}-2\mu _{i,j}\sum _{r=i}^{j}x_{r}+\left( j-i+1\right) \mu ^{2}_{i,j}\right) \\&=\frac{1}{j-i+1}\sum _{r=i}^{j}x^{2}_{r}-2\mu _{i,j}\frac{1}{j-i+1}\sum _{r=i}^{j}x_{r}+\mu ^{2}_{i,j} \end{aligned}$$

From Eq. (5), we know $\frac{1}{j-i+1}\sum _{r=i}^{j}x_{r}=\mu _{i,j}$. Then we have:

$$\begin{aligned} \sigma ^{2}_{i,j}&=\frac{1}{j-i+1}\sum _{r=i}^{j}x^{2}_{r}-2\mu ^{2}_{i,j}+\mu ^{2}_{i,j}\\&=\frac{1}{j-i+1}\sum _{r=i}^{j}x^{2}_{r}-\mu ^{2}_{i,j} \end{aligned}$$

Finally, taking the square root, we get:

$$\begin{aligned} \sigma _{i,j}=\sqrt{\frac{1}{j-i+1}\sum _{r=i}^{j}x^{2}_{r}-\mu ^{2}_{i,j}} \end{aligned}$$
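The practical value of this form is that the mean and standard deviation of any window can be obtained from two running sums (of $x_{r}$ and of $x^{2}_{r}$) instead of a second pass over the data. The following is a minimal Python sketch of that idea, not the authors' implementation; the names `RunningStats`, `mean_std` and `direct_std` are hypothetical, and indices are 1-based and inclusive to match the notation above.

```python
import math


class RunningStats:
    """Prefix sums of x and x^2 so that the mean and standard deviation
    of any window x_i..x_j can be computed in O(1) per query."""

    def __init__(self):
        self.prefix_sum = [0.0]  # prefix_sum[k] = x_1 + ... + x_k
        self.prefix_sq = [0.0]   # prefix_sq[k]  = x_1^2 + ... + x_k^2

    def append(self, x):
        # O(1) work per incoming stream value.
        self.prefix_sum.append(self.prefix_sum[-1] + x)
        self.prefix_sq.append(self.prefix_sq[-1] + x * x)

    def mean_std(self, i, j):
        """Return (mu, sigma) for the window x_i..x_j, 1-based inclusive."""
        n = j - i + 1
        window_sum = self.prefix_sum[j] - self.prefix_sum[i - 1]
        window_sq = self.prefix_sq[j] - self.prefix_sq[i - 1]
        mu = window_sum / n
        # sigma^2 = (1/n) * sum(x^2) - mu^2; clamp at zero because
        # floating-point rounding can make the difference slightly negative.
        variance = max(window_sq / n - mu * mu, 0.0)
        return mu, math.sqrt(variance)


def direct_std(values):
    """Standard deviation computed from the original definition, for comparison."""
    n = len(values)
    mu = sum(values) / n
    return math.sqrt(sum((v - mu) ** 2 for v in values) / n)


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
stats = RunningStats()
for value in data:
    stats.append(value)

mu_all, sigma_all = stats.mean_std(1, len(data))
mu_sub, sigma_sub = stats.mean_std(2, 4)
```

Both forms agree up to rounding error. One caveat of the sum-of-squares form is that it subtracts two quantities of similar size, so it can lose precision when the values are large relative to their spread, which is why the sketch clamps the variance at zero.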

About this article

Cite this article

Gong, X., Fong, S., Wong, R.K. et al. Discovering sub-patterns from time series using a normalized cross-match algorithm. J Supercomput 72, 3850–3867 (2016). https://doi.org/10.1007/s11227-016-1632-z

Published: 04 February 2016
Issue Date: October 2016
DOI: https://doi.org/10.1007/s11227-016-1632-z
