Similarity measures for time series data classification using grid representation and matrix distance

Ye, Yanqing; Jiang, Jiang; Ge, Bingfeng; Dou, Yajie; Yang, Kewei

doi:10.1007/s10115-018-1264-0

Regular Paper
Published: 05 September 2018

Volume 60, pages 1105–1134, (2019)
Knowledge and Information Systems

Yanqing Ye¹ (ORCID: orcid.org/0000-0003-4244-5385), Jiang Jiang¹, Bingfeng Ge¹, Yajie Dou¹ & Kewei Yang¹

Abstract

Two similarity measures are proposed that can successfully capture both the numerical and point distribution characteristics of time series. More specifically, a novel grid representation for time series is first presented, with which a time series is segmented and compiled into a matrix format. Based on the proposed grid representation, two matrix matching algorithms, matrix-based Euclidean distance (GMED) and matrix-based dynamic time warping (GMDTW), are adapted to measure the similarity of matrix-like time series. Finally, to assess the effectiveness of the proposed similarity measures, 1NN classification and K-means experiments are conducted on 22 datasets from the UCR time series datasets website. In general, the results indicate that the GMDTW measure is apparently superior to most current measures in accuracy, while the GMED measure achieves much higher efficiency than the dynamic time warping algorithm with equivalent performance. Furthermore, the effects of the parameters in the proposed measures are analyzed, and a way to determine their values is given.
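
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define the grid cells or the matrix distances, so the sketch assumes that each cell stores the number of points falling into it, that the lock-step distance compares matrices cell by cell, and that the warping distance runs ordinary dynamic time warping over matrix columns. The names grid_matrix, matrix_ed, and matrix_dtw, and the parameters rows and cols, are hypothetical.

```python
import math


def grid_matrix(series, rows, cols):
    """Count the points of `series` that fall into each cell of a rows x cols grid.

    Assumption: the time axis is cut into `cols` equal segments and the
    value range into `rows` equal bands; the paper may define cells differently.
    """
    lo, hi = min(series), max(series)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant series
    n = len(series)
    grid = [[0] * cols for _ in range(rows)]
    for t, value in enumerate(series):
        c = min(t * cols // n, cols - 1)
        r = min(int((value - lo) / span * rows), rows - 1)
        grid[r][c] += 1
    return grid


def column(grid, j):
    """Return column j of the grid as a list."""
    return [row[j] for row in grid]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def matrix_ed(a, b):
    """Lock-step distance: compare two same-shape matrices cell by cell."""
    return math.sqrt(
        sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    )


def matrix_dtw(a, b):
    """Dynamic time warping in which the aligned elements are grid columns."""
    n, m = len(a[0]), len(b[0])
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(column(a, i - 1), column(b, j - 1))
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


rising = grid_matrix([0, 1, 2, 3, 4, 5, 6, 7], rows=2, cols=4)
falling = grid_matrix([7, 6, 5, 4, 3, 2, 1, 0], rows=2, cols=4)
print(rising)                       # point counts per grid cell
print(matrix_ed(rising, falling))   # lock-step matrix distance
print(matrix_dtw(rising, falling))  # warped matrix distance
```

In this sketch the lock-step distance touches each cell once, while the warping distance fills a table of size cols by cols, which is consistent with the efficiency contrast the abstract draws between GMED and dynamic time warping. Either function could serve as the distance in a 1NN classifier.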

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grant Nos. 71571182, 71571185, and 71671186, and by the Research Project of National University of Defense Technology. The authors would like to thank the UCR time series archive for providing online datasets and the results of some of the measures. Many thanks to the reviewers for their sound advice, which was very helpful in improving our paper.

Author information

Authors and Affiliations

College of Systems Engineering, National University of Defense Technology, Changsha, 410073, Hunan, China
Yanqing Ye, Jiang Jiang, Bingfeng Ge, Yajie Dou & Kewei Yang

Corresponding author

Correspondence to Yanqing Ye.

About this article

Cite this article

Ye, Y., Jiang, J., Ge, B. et al. Similarity measures for time series data classification using grid representation and matrix distance. Knowl Inf Syst 60, 1105–1134 (2019). https://doi.org/10.1007/s10115-018-1264-0

Received: 25 April 2017
Revised: 04 May 2018
Accepted: 04 July 2018
Published: 05 September 2018
Issue Date: 01 August 2019
DOI: https://doi.org/10.1007/s10115-018-1264-0
