Abstract
Dynamic Time Warping (DTW) is a very popular similarity measure used for time series classification, retrieval, or clustering. DTW is, however, computationally costly, and applying it to numerous and/or very long time series is difficult in practice. This paper proposes a new approach for time series retrieval: time series are embedded into another space where the search procedure is less computationally demanding while remaining accurate. The approach transforms time series into high-dimensional vectors using DTW-preserving shapelets. The transform is such that the relative distances between vectors in the Euclidean transformed space closely reflect the corresponding DTW measurements in the original space. We also propose strategies for selecting a subset of shapelets in the transformed space, resulting in a trade-off between the complexity of the transformation and the accuracy of the retrieval. Experimental results on the well-known UCR time series datasets demonstrate the importance of this trade-off.
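The pipeline described in the abstract (embed each series with a set of shapelets, then search with Euclidean distance instead of DTW) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names (dtw, shapelet_transform, retrieve), the synthetic random-walk data, and the windowed distance used for each feature are assumptions made for illustration. In particular, the shapelets here are simply cut from the data, not learned, so this embedding is not expected to preserve DTW the way the paper's learned shapelets are meant to.

```python
import numpy as np

def dtw(x, y):
    """Textbook O(len(x) * len(y)) DTW between two 1-D series,
    using squared differences as the local cost."""
    n, m = len(x), len(y)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2
            acc[i, j] = cost + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return float(np.sqrt(acc[n, m]))

def shapelet_transform(series, shapelets):
    """Embed one series as a vector: feature k is the smallest distance
    between shapelet k and any same-length window of the series.
    (Assumed definition: root-mean-square difference per window.)"""
    features = []
    for s in shapelets:
        windows = np.lib.stride_tricks.sliding_window_view(series, len(s))
        dists = np.sqrt(np.mean((windows - s) ** 2, axis=1))
        features.append(dists.min())
    return np.array(features)

def retrieve(query_vec, database_vecs, k=1):
    """Indices of the k nearest database vectors under Euclidean distance."""
    d = np.linalg.norm(database_vecs - query_vec, axis=1)
    return np.argsort(d)[:k]

# Synthetic data (assumption): 20 random-walk series of length 64.
rng = np.random.default_rng(0)
database = rng.standard_normal((20, 64)).cumsum(axis=1)
query = database[3] + 0.05 * rng.standard_normal(64)

# Hypothetical shapelets: subsequences of several lengths, NOT learned.
shapelets = [database[i, 10:10 + length] for i, length in [(0, 8), (1, 16), (2, 24)]]

# Embed the database once, then answer queries in the embedded space.
db_vecs = np.array([shapelet_transform(s, shapelets) for s in database])
q_vec = shapelet_transform(query, shapelets)
candidates = retrieve(q_vec, db_vecs, k=5)

# Compare the embedded-space ranking with the exact DTW values.
for idx in candidates:
    print(idx, dtw(query, database[idx]))
```

The embedding of the database is computed once, so each query costs one transform plus cheap vector comparisons. Using fewer shapelets makes the transform cheaper but gives a coarser embedding, which is the trade-off the abstract refers to.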
Acknowledgments
The current work has been performed with the support of CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico), Brazil (Process number 233209/2014–0). The authors are grateful to the TRANSFORM project funded by STIC-AMSUD (18-STIC-09) for the partial financial support to this work.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Sperandio, R.C., Malinowski, S., Amsaleg, L., Tavenard, R. (2018). Time Series Retrieval Using DTW-Preserving Shapelets. In: Marchand-Maillet, S., Silva, Y., Chávez, E. (eds) Similarity Search and Applications. SISAP 2018. Lecture Notes in Computer Science, vol 11223. Springer, Cham. https://doi.org/10.1007/978-3-030-02224-2_20
DOI: https://doi.org/10.1007/978-3-030-02224-2_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-02223-5
Online ISBN: 978-3-030-02224-2
eBook Packages: Computer Science, Computer Science (R0)