Abstract
Motif discovery is a fundamental operation in the analysis of time series data. Existing motif discovery algorithms that support Dynamic Time Warping require the exact motif length to be specified manually. However, choosing an appropriate length for interesting motifs can be challenging, and an inappropriate choice may cause valuable patterns to be overlooked. This paper addresses this problem by proposing algorithms that automatically compute motifs of all lengths using Dynamic Time Warping. Specifically, a batch algorithm and an anytime algorithm are designed, referred to as BatchMotif and AnytimeMotif respectively. The proposed algorithms achieve significant improvements in efficiency by fully leveraging the correlations between motifs of different lengths. Experiments conducted on real datasets demonstrate the superiority of both proposed algorithms. On average, BatchMotif is 13 times faster than the baseline method. Additionally, AnytimeMotif is 7 times faster than the baseline method and is capable of providing reasonably good results after only a small portion of the computation.







Notes
In this paper, the term "candidate" refers to a pair of subsequences of equal length. Because \(L_{\min } < L_{\max } \ll n\), there are \(O(n^2)\) candidates for a fixed motif length.
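As a rough check of this count, assume \(n\) denotes the length of the time series and \(L\) a fixed motif length (both readings are assumptions based on the note above), and ignore any exclusion of overlapping pairs. The series then contains \(n-L+1\) subsequences of length \(L\), so the number of unordered pairs is

\[
\binom{n-L+1}{2} = \frac{(n-L+1)(n-L)}{2} \approx \frac{n^2}{2} = O(n^2) \quad \text{when } L \ll n .
\]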
Set \(\hat{S}_i^L[t] \leftarrow -\hat{S}_i^L[t]\), \(\hat{S}_i^{L+K}[t] \leftarrow -\hat{S}_i^{L+K}[t]\), \(u_t \leftarrow -v_t\), \(u_t \leftarrow -v_t\), \(u'_t \leftarrow -v'_t\), \(u'_t \leftarrow -v'_t\). Clearly, such a permutation does not affect the value of \(LB_{Keogh}\), and the situation becomes identical to the previous case.
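The invariance used here can be checked numerically. The following is a minimal sketch, assuming the textbook equal-length form of \(LB_{Keogh}\) with a warping window of `w` points; it is not the paper's variant for subsequences of lengths \(L\) and \(L+K\), and the names `envelope`, `lb_keogh`, `a`, `b` and `w` are illustrative only. Negating both sequences swaps the roles of the upper and lower envelopes, so the bound should be unchanged.

```python
import math
import random


def envelope(series, w):
    """Upper and lower envelopes of `series` for a warping window of w points."""
    n = len(series)
    upper, lower = [], []
    for t in range(n):
        window = series[max(0, t - w):min(n, t + w + 1)]
        upper.append(max(window))
        lower.append(min(window))
    return upper, lower


def lb_keogh(query, candidate, w):
    """Textbook LB_Keogh: distance from `candidate` to the envelope of `query`."""
    upper, lower = envelope(query, w)
    total = 0.0
    for c, u, l in zip(candidate, upper, lower):
        if c > u:
            total += (c - u) ** 2   # candidate lies above the envelope
        elif c < l:
            total += (l - c) ** 2   # candidate lies below the envelope
    return math.sqrt(total)


# Hypothetical test data: two random sequences of equal length.
random.seed(0)
a = [random.gauss(0.0, 1.0) for _ in range(64)]
b = [random.gauss(0.0, 1.0) for _ in range(64)]
w = 4

original = lb_keogh(a, b, w)
negated = lb_keogh([-x for x in a], [-x for x in b], w)
print(original, negated)  # the two values should agree
```

After negation, the new upper envelope is the negated old lower envelope and vice versa, so every term that was counted as "above" is now counted as "below" with the same magnitude.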
\(LB_{Keogh2}\), \(LB_{Keogh4}\), \(\ldots\), \(LB_{Keogh16}\) are downsampled variants of \(LB_{Keogh}\); they are faster to compute but less effective.
In our paper, the top-1 ED motif is computed using the STOMP algorithm [15].
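For readers unfamiliar with the term, the sketch below illustrates what a top-1 ED motif is by brute force. It is not STOMP, which computes the same kind of result far more efficiently. It assumes z-normalized Euclidean distance and excludes overlapping pairs as trivial matches; both choices, along with the function names and the toy series, are assumptions made for illustration.

```python
import math


def znorm(x):
    """Z-normalize a subsequence (zero mean, unit standard deviation)."""
    mean = sum(x) / len(x)
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / len(x))
    if std == 0.0:
        return [0.0] * len(x)  # constant subsequence: map to all zeros
    return [(v - mean) / std for v in x]


def top1_ed_motif(series, length):
    """Return (distance, i, j) for the closest pair of non-overlapping subsequences."""
    subs = [znorm(series[i:i + length]) for i in range(len(series) - length + 1)]
    best = (math.inf, -1, -1)
    for i in range(len(subs)):
        # Start at i + length so the two subsequences never overlap.
        for j in range(i + length, len(subs)):
            d = math.sqrt(sum((p - q) ** 2 for p, q in zip(subs[i], subs[j])))
            if d < best[0]:
                best = (d, i, j)
    return best


# Toy series in which the shape [0, 1, 3, 1, 0] occurs at offsets 0 and 9.
series = [0, 1, 3, 1, 0, 5, 2, 7, 4, 0, 1, 3, 1, 0, 6, 2]
dist, i, j = top1_ed_motif(series, 5)
print(dist, i, j)
```

This brute-force search costs \(O(n^2 L)\) time for a series of length \(n\), which is why a dedicated algorithm is used in practice.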
References
Dau, H.A., Keogh, E. (2017) Matrix profile v: A generic technique to incorporate domain knowledge into motif discovery. In: Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining, pp. 125–134
Chiu, B., Keogh, E., Lonardi, S. (2003) Probabilistic discovery of time series motifs. In: Proceedings of the Ninth ACM SIGKDD international conference on knowledge discovery and data mining, pp. 493–498
Mueen, A., Keogh, E., Zhu, Q., Cash, S., Westover, B. (2009) Exact discovery of time series motifs. In: Proceedings of the 2009 SIAM international conference on data mining, pp. 473–484. SIAM
Mueen, A., Keogh, E. (2010) Online discovery and maintenance of time series motifs. In: Proceedings of the 16th ACM SIGKDD international conference on knowledge discovery and data mining, pp. 1089–1098
Alaee, S., Kamgar, K., Keogh, E. (2020) Matrix profile xxii: exact discovery of time series motifs under dtw. In: 2020 IEEE international conference on data mining (ICDM), pp. 900–905. IEEE
Alaee, S., Mercer, R., Kamgar, K., Keogh, E.: Time series motifs discovery under dtw allows more robust discovery of conserved structure. Data Mining and Knowledge Discovery 35, 863–910 (2021)
Vullings, H., Verhaegen, M.H., Verbruggen, H. (1998) Automated ecg segmentation with dynamic time warping. In: Proceedings of the 20th annual international conference of the IEEE engineering in medicine and biology society. vol. 20 Biomedical Engineering Towards the Year 2000 and Beyond (Cat. No. 98CH36286), pp. 163–166. IEEE
Wang, K., Gasser, T.: Alignment of curves by dynamic time warping. The Annals of Statistics 25(3), 1251–1276 (1997)
Rakthanmanon, T., Campana, B., Mueen, A., Batista, G., Westover, B., Zhu, Q., Zakaria, J., Keogh, E.: Addressing big data time series: Mining trillions of time series subsequences under dynamic time warping. ACM Transactions on Knowledge Discovery from Data (TKDD) 7(3), 1–31 (2013)
Wu, J., Wang, P., Pan, N., Wang, C., Wang, W., Wang, J. (2019) Kv-match: A subsequence matching approach supporting normalization and time warping. In: 2019 IEEE 35th international conference on data engineering (ICDE), pp. 866–877. IEEE
Madrid, F., Imani, S., Mercer, R., Zimmerman, Z., Shakibay, N., Keogh, E. (2019) Matrix profile xx: Finding and visualizing time series motifs of all lengths using the matrix profile. In: 2019 IEEE International conference on big knowledge (ICBK), pp. 175–182. IEEE
Linardi, M., Palpanas, T.: Scalable, variable-length similarity search in data series: The ulisse approach. Proceedings of the VLDB Endowment 11(13), 2236–2248 (2018)
Keogh, E., Ratanamahatana, C.A.: Exact indexing of dynamic time warping. Knowl. Inf. Syst. 7, 358–386 (2005)
Gharghabi, S., Yeh, C.-C.M., Ding, Y., Ding, W., Hibbing, P., LaMunion, S., Kaplan, A., Crouter, S.E., Keogh, E.: Domain agnostic online semantic segmentation for multi-dimensional time series. Data Min. Knowl. Disc. 33, 96–130 (2019)
Yeh, C.-C.M., Zhu, Y., Ulanova, L., Begum, N., Ding, Y., Dau, H.A., Silva, D.F., Mueen, A., Keogh, E. (2016) Matrix profile i: all pairs similarity joins for time series: a unifying view that includes motifs, discords and shapelets. In: 2016 IEEE 16th international conference on data mining (ICDM), pp. 1317–1322. IEEE
Box, G.E., Jenkins, G.M., Reinsel, G.C., Ljung, G.M. (2015) Time Series Analysis: Forecasting and Control. Wiley
Böse, J.-H., Flunkert, V., Gasthaus, J., Januschowski, T., Lange, D., Salinas, D., Schelter, S., Seeger, M., Wang, Y.: Probabilistic demand forecasting at scale. Proceedings of the VLDB Endowment 10(12), 1694–1705 (2017)
Lim, B., Zohren, S.: Time-series forecasting with deep learning: a survey. Phil. Trans. R. Soc. A 379(2194), 20200209 (2021)
Gupta, M., Gao, J., Aggarwal, C.C., Han, J.: Outlier detection for temporal data: A survey. IEEE Trans. Knowl. Data Eng. 26(9), 2250–2267 (2013)
Blázquez-García, A., Conde, A., Mori, U., Lozano, J.A.: A review on outlier/anomaly detection in time series data. ACM Computing Surveys (CSUR) 54(3), 1–33 (2021)
Abanda, A., Mori, U., Lozano, J.A.: A review on distance based time series classification. Data Min. Knowl. Disc. 33(2), 378–412 (2019)
Ismail Fawaz, H., Forestier, G., Weber, J., Idoumghar, L., Muller, P.-A.: Deep learning for time series classification: a review. Data Min. Knowl. Disc. 33(4), 917–963 (2019)
Lin, J., Keogh, E., Lonardi, S., Patel, P. (2002) Finding motifs in time series. In: Proc. of the 2nd workshop on temporal data mining, pp. 53–68
Yeh, C.-C.M., Kavantzas, N., Keogh, E. (2017) Matrix profile vi: Meaningful multidimensional motif discovery. In: 2017 IEEE international conference on data mining (ICDM), pp. 565–574. IEEE
Zhu, Y., Zimmerman, Z., Senobari, N.S., Yeh, C.-C.M., Funning, G., Mueen, A., Brisk, P., Keogh, E. (2016) Matrix profile ii: Exploiting a novel algorithm and gpus to break the one hundred million barrier for time series motifs and joins. In: 2016 IEEE 16th international conference on data mining (ICDM), pp. 739–748. IEEE
Zimmerman, Z., Kamgar, K., Senobari, N.S., Crites, B., Funning, G., Brisk, P., Keogh, E. (2019) Matrix profile xiv: scaling time series motif discovery with gpus to break a quintillion pairwise comparisons a day and beyond. In: Proceedings of the ACM symposium on cloud computing, pp. 74–86
Zhu, Y., Yeh, C.-C.M., Zimmerman, Z., Kamgar, K., Keogh, E. (2018) Matrix profile xi: Scrimp++: time series motif discovery at interactive speeds. In: 2018 IEEE international conference on data mining (ICDM), pp. 837–846. IEEE
Ratanamahatana, C.A., Keogh, E. (2004) Everything you know about dynamic time warping is wrong. In: Third workshop on mining temporal and sequential data, vol. 32. Citeseer
Murray, D., Stankovic, L., Stankovic, V.: An electrical load measurements dataset of united kingdom households from a two-year longitudinal study. Scientific Data 4(1), 1–12 (2017)
Willett, D.S., George, J., Willett, N.S., Stelinski, L.L., Lapointe, S.L.: Machine learning for characterization of insect vector feeding. PLoS Computational Biology 12(11), 1005158 (2016)
Acknowledgements
Thanks to Dr. Zhixin Qi and Yonghang Yu for proofreading the paper.
Funding
This work is supported by the National Natural Science Foundation of China (NSFC) under Grant Nos. U19A2059, U22A2025, U1811461, 61972110, 61832003, 61872105, 62072136.
Author information
Contributions
Zemin Chao provided the main idea for the paper and conducted the experiments. Hong Gao and Dongjing Miao were responsible for designing the experiments and algorithms. Hongzhi Wang provided valuable suggestions to improve the algorithms and played a crucial role in writing the manuscript.
Ethics declarations
Competing interests
We declare that all authors have no competing interests as defined by Springer, or other interests that might be perceived to influence the results and discussion reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Chao, Z., Gao, H., Miao, D. et al. Discovering time series motifs of all lengths using dynamic time warping. World Wide Web 26, 3815–3836 (2023). https://doi.org/10.1007/s11280-023-01207-6
DOI: https://doi.org/10.1007/s11280-023-01207-6