Elastic distances for time-series classification: Itakura versus Sakoe-Chiba constraints

  • Regular Paper
Knowledge and Information Systems

Abstract

In the field of time series data mining, the accuracy of the simple but very successful nearest neighbor (NN) classifier depends directly on the chosen similarity measure. To improve the efficiency of the elastic measures introduced to overcome the shortcomings of Euclidean distance, the Sakoe-Chiba band is usually applied as a constraint. In this paper, we provide a detailed analysis of the influence of the alternative Itakura parallelogram constraint on the accuracy of the NN classifier in combination with four well-known elastic measures, compared to the Sakoe-Chiba constraint and the unconstrained variants of these measures. The findings suggest that, although the Sakoe-Chiba band generally produces better results, the Itakura parallelogram is a better choice for certain types of datasets.

Data availability

The results presented in this paper are based on the UCR Time Series Classification Archive (https://www.cs.ucr.edu/~eamonn/time_series_data_2018/).

References

  1. Laxman S, Sastry PS (2006) A survey of temporal data mining. Sadhana 31:173–198. https://doi.org/10.1007/BF02719780

  2. Mitsa T (2010) Temporal Data Mining. Taylor & Francis

  3. Esling P, Agon C (2012) Time-series data mining. ACM Comput Surv 45:11:1-12:34. https://doi.org/10.1145/2379776.2379788

  4. Singh P, Borah B (2014) Forecasting stock index price based on M-factors fuzzy time series and particle swarm optimization. Int J Approx Reason 55:812–833. https://doi.org/10.1016/j.ijar.2013.09.014

  5. Pecev P, Rackovic M (2017) LTR-MDTS structure - a structure for multiple dependent time series prediction. Comput Sci Inf Syst 14:467–490. https://doi.org/10.2298/CSIS150815004P

  6. Wang X, Mueen A, Ding H et al (2013) Experimental comparison of representation methods and distance measures for time series data. Data Min Knowl Discov 26:275–309. https://doi.org/10.1007/s10618-012-0250-5

  7. Gou J, Sun L, Du L et al (2022) A representation coefficient-based k-nearest centroid neighbor classifier. Expert Syst Appl 194:116529. https://doi.org/10.1016/j.eswa.2022.116529

  8. Gou J, Ma H, Ou W et al (2019) A generalized mean distance-based k-nearest neighbor classifier. Expert Syst Appl 115:356–372. https://doi.org/10.1016/j.eswa.2018.08.021

  9. Singh P, Borah B (2013) High-order fuzzy-neuro expert system for time series forecasting. Knowledge-Based Syst 46:12–21. https://doi.org/10.1016/j.knosys.2013.01.030

  10. Radovanović M, Nanopoulos A, Ivanović M (2010) Time-Series Classification in Many Intrinsic Dimensions. In: Proceedings of the 2010 SIAM International Conference on Data Mining. Society for Industrial and Applied Mathematics, Philadelphia, PA, pp 677–688

  11. Ding H, Trajcevski G, Scheuermann P, et al (2008) Querying and mining of time series data: experimental comparison of representations and distance measures. In: Proceedings of the VLDB Endowment. VLDB Endowment, pp 1542–1552

  12. Keogh E, Ratanamahatana CA (2005) Exact indexing of dynamic time warping. Knowl Inf Syst 7:358–386. https://doi.org/10.1007/s10115-004-0154-9

  13. Xi X, Keogh E, Shelton C, et al (2006) Fast time series classification using numerosity reduction. In: Proceedings of the 23rd international conference on Machine learning - ICML ’06. ACM Press, New York, New York, USA, pp 1033–1040

  14. Berndt D, Clifford J (1994) Using dynamic time warping to find patterns in time series. In: Usama M. Fayyad RU (ed) Knowledge Discovery in Databases: Papers from the 1994 AAAI Workshop. AAAI Press, Seattle, Washington, pp 359–370

  15. Vlachos M, Kollios G, Gunopulos D (2002) Discovering similar multidimensional trajectories. In: Proceedings 18th International Conference on Data Engineering. IEEE Comput. Soc, pp 673–684

  16. Chen L, Ng R (2004) On The Marriage of Lp-norms and Edit Distance. In: Nascimento MA, Özsu MT, Kossmann D et al (eds) Proceedings 2004 VLDB Conference. Elsevier, pp 792–803

  17. Chen L, Özsu MT, Oria V (2005) Robust and fast similarity search for moving object trajectories. In: Proceedings of the 2005 ACM SIGMOD international conference on Management of data - SIGMOD ’05. ACM Press, New York, New York, USA, pp 491–502

  18. Sakoe H, Chiba S (1978) Dynamic programming algorithm optimization for spoken word recognition. IEEE Trans Acoust 26:43–49. https://doi.org/10.1109/TASSP.1978.1163055

  19. Geler Z (2015) Role of Similarity Measures in Time Series Analysis. Dissertation, University of Novi Sad, Serbia

  20. Geler Z, Kurbalija V, Radovanović M, Ivanović M (2014) Impact of the Sakoe-Chiba band on the DTW time series distance measure for kNN classification. In: Buchmann R, Kifor CV, Yu J (eds) The 7th International conference on knowledge science, engineering and management KSEM 2014. Springer International Publishing, Cham, pp 105–114

  21. Kurbalija V, Radovanović M, Geler Z, Ivanović M (2011) The influence of global constraints on DTW and LCS similarity measures for time-series databases. In: Dicheva D, Markov Z, Stefanova E (eds) Third international conference on software, services and semantic technologies S3T 2011 SE - 10. Springer, Berlin Heidelberg, pp 67–74

  22. Kurbalija V, Radovanović M, Geler Z, Ivanović M (2014) The influence of global constraints on similarity measures for time-series databases. Knowledge-Based Syst 56:49–67. https://doi.org/10.1016/j.knosys.2013.10.021

  23. Ratanamahatana CA, Keogh E (2005) Three Myths about Dynamic Time Warping Data Mining. In: Proceedings of the 2005 SIAM International Conference on Data Mining. Society for Industrial and Applied Mathematics, Philadelphia, PA, pp 506–510

  24. Geler Z, Kurbalija V, Ivanovic M, et al (2019) Dynamic Time Warping: Itakura vs Sakoe-Chiba. In: 2019 IEEE International Symposium on INnovations in Intelligent SysTems and Applications (INISTA). IEEE, pp 1–6

  25. Itakura F (1975) Minimum prediction residual principle applied to speech recognition. IEEE Trans Acoust 23:67–72. https://doi.org/10.1109/TASSP.1975.1162641

  26. Anh Dau H, Keogh E, Kamgar K, et al (2019) The UCR Time Series Classification Archive. https://www.cs.ucr.edu/~eamonn/time_series_data_2018/

  27. Faloutsos C, Ranganathan M, Manolopoulos Y (1994) Fast subsequence matching in time-series databases. ACM SIGMOD Rec 23:419–429. https://doi.org/10.1145/191843.191925

  28. Agrawal R, Faloutsos C, Swami A (1993) Efficient similarity search in sequence databases. In: David B. Lomet (ed) Proceedings of the 4th International Conference on Foundations of Data Organization and Algorithms (FODO ’93). Springer Berlin Heidelberg, pp 69–84

  29. Rakthanmanon T, Campana B, Mueen A, et al (2012) Searching and Mining Trillions of Time Series Subsequences Under Dynamic Time Warping. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, New York, NY, USA, pp 262–270

  30. Górecki T, Łuczak M (2019) The influence of the Sakoe-Chiba band size on time series classification. J Intell Fuzzy Syst 36:527–539. https://doi.org/10.3233/JIFS-18839

  31. Strle B, Možina M, Bratko I (2009) Qualitative approximation to Dynamic Time Warping similarity between time series data. In: Proceedings of the 23rd international workshop on qualitative reasoning. pp 104–110

  32. Salvador S, Chan P (2007) Toward accurate dynamic time warping in linear time and space. Intell Data Anal 11:561–580

  33. Wu R, Keogh EJ (2020) FastDTW is approximate and Generally Slower than the Algorithm it Approximates. IEEE Trans Knowl Data Eng. https://doi.org/10.1109/TKDE.2020.3033752

  34. Bagnall A, Lines J, Bostrom A et al (2017) The great time series classification bake off: a review and experimental evaluation of recent algorithmic advances. Data Min Knowl Discov 31:606–660. https://doi.org/10.1007/s10618-016-0483-9

  35. Jiang W (2020) Time series classification: nearest neighbor versus deep learning models. SN Appl Sci 2:721. https://doi.org/10.1007/s42452-020-2506-9

  36. Witten IH, Frank E, Hall MA, Pal CJ (2017) Data mining: practical machine learning tools and techniques, 4th edn. Morgan Kaufmann

  37. García S, Fernández A, Luengo J, Herrera F (2010) Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: Experimental analysis of power. Inf Sci (Ny) 180:2044–2064. https://doi.org/10.1016/j.ins.2009.12.010

  38. García S, Fernández A, Luengo J, Herrera F (2009) A study of statistical techniques and performance measures for genetics-based machine learning: accuracy and interpretability. Soft Comput 13:959–977. https://doi.org/10.1007/s00500-008-0392-y

  39. García S, Herrera F (2008) An extension on “statistical comparisons of classifiers over multiple data sets” for all pairwise comparisons. J Mach Learn Res 9:2677–2694

  40. Bouckaert RR, Frank E (2004) Evaluating the replicability of significance tests for comparing learning algorithms. In: Dai H, Srikant R, Zhang C (eds) Advances in knowledge discovery and data mining. Springer, Berlin Heidelberg, pp 3–12

  41. Batista GEAPA, Wang X, Keogh EJ (2011) A Complexity-Invariant Distance Measure for Time Series. In: Proceedings of the 2011 SIAM International Conference on Data Mining. Society for Industrial and Applied Mathematics, Philadelphia, PA, pp 699–710

  42. Paparrizos J (2019) 2018 UCR Time-series archive: backward compatibility, missing values, and varying lengths. https://github.com/johnpaparrizos/UCRArchiveFixes

  43. Geler Z, Kurbalija V, Radovanović M, Ivanović M (2016) Comparison of different weighting schemes for the kNN classifier on time-series data. Knowl Inf Syst 48:331–378. https://doi.org/10.1007/s10115-015-0881-0

  44. Geler Z, Kurbalija V, Ivanović M, Radovanović M (2020) Weighted kNN and constrained elastic distances for time-series classification. Expert Syst Appl 162:113829. https://doi.org/10.1016/j.eswa.2020.113829

  45. Kurbalija V, Radovanović M, Geler Z, Ivanović M (2010) A Framework for time-series analysis. In: Dicheva D, Dochev D (eds) Artificial intelligence: methodology, systems, and applications SE - 5. Springer, Berlin Heidelberg, pp 42–51

  46. Kurbalija V, Ivanović M, Geler Z, Radovanović M (2018) Two faces of the framework for analysis and prediction, part 1 - education. Inf Technol Control 47:249–261. https://doi.org/10.5755/j01.itc.47.2.18746

  47. Kurbalija V, Ivanović M, Geler Z, Radovanović M (2018) Two faces of the framework for analysis and prediction, part 2 - research. Inf Technol Control 47:489–502. https://doi.org/10.5755/j01.itc.47.3.18747

  48. Mitrović D, Geler Z, Ivanović M (2012) Distributed distance matrix generator based on agents. In: Proceedings of the 2nd international conference on web intelligence, mining and semantics - WIMS ’12. ACM Press, New York, New York, USA, pp 1–6

  49. Mitrovic D, Ivanović M, Geler Z (2014) Agent-based distributed computing for dynamic networks. Inf Technol Control 43:88–97. https://doi.org/10.5755/j01.itc.43.1.4588

  50. Kurbalija V, Ivanović M, Radovanović M, et al (2015) Cultural Differences and Similarities in Emotion Recognition. In: Proceedings of the 7th Balkan conference on informatics conference - BCI ’15. ACM Press, New York, New York, USA, pp 1–6

  51. Kurbalija V, Ivanović M, Radovanović M et al (2018) Emotion perception and recognition: an exploration of cultural differences and similarities. Cogn Syst Res 52:103–116. https://doi.org/10.1016/j.cogsys.2018.06.009

  52. Ratanamahatana CA, Keogh E (2004) Making Time-series Classification More Accurate Using Learned Constraints. In: Proceedings of the 2004 SIAM International Conference on Data Mining. Society for Industrial and Applied Mathematics, Philadelphia, PA, pp 11–22

Acknowledgements

V. Kurbalija, M. Ivanović, and M. Radovanović acknowledge financial support of the Ministry of Education, Science and Technological Development of the Republic of Serbia (Grant No. 451-03-68/2022-14/200125). The authors would like to express special thanks to Eamonn Keogh for his valuable comments about the paper, and for collecting and making available the UCR time-series datasets, and also to everyone who contributed data to the collection, without which the presented work would not have been possible.

Funding

Partial financial support was received from the Ministry of Education, Science and Technological Development of the Republic of Serbia (Grant No. 451-03-68/2022-14/200125).

Author information

Contributions

Z. Geler and V. Kurbalija designed and performed the experiments. V. Kurbalija and M. Radovanović provided the expertise for interpreting the results and designing the experiments. Z. Geler was involved in data acquisition and pre-processing. V. Kurbalija, M. Radovanović and M. Ivanović provided valuable feedback and interpretation of the experimental results. Z. Geler drafted the manuscript, while V. Kurbalija, M. Radovanović and M. Ivanović gave comments and made modifications to the manuscript. All authors approved the manuscript. M. Ivanović coordinated the joint research.

Corresponding author

Correspondence to Zoltan Geler.

Ethics declarations

Conflict of interest

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: The intuition behind the Sakoe-Chiba band and the Itakura parallelogram

The reader may wonder why both the Sakoe-Chiba band and the Itakura parallelogram exist. It may be helpful to revisit DTW in the context of its invention as a speech processing tool. In addition, let us use text as an intuitive proxy for time series. For some speech processing problems, it is easy to precisely locate the beginning and end of an utterance (by the dramatic change in volume). Imagine we wish to compare the name of a Southern US state pronounced with a stuttered “plosive” consonant, Mississippppppi, with a version that has significant sibilance, Missssissssippi. The reader will appreciate that DTW is an ideal tool to recognize that these refer to the same word. Note that for such comparisons there is an interesting constraint: both utterances start with the same letter, “M”, and end with the same letter, “i”. This means there is no warping at the beginning of the alignment, the warping can grow as we traverse the words, and as we move toward the end of the word, the fact that we must end with the same sound again constrains the amount of warping that is possible. In geometric terms, this describes a parallelogram. The parallelogram-shaped alignment is very common in natural and human processes where there is a mechanism that naturally aligns the beginning and end of a process, but the amount of lagging or leading is proportional to the distance to the nearer of the two endpoints. For example, amateur musicians typically start and end in time, but may differ in the middle of a performance.
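The parallelogram described above can be sketched as a mask over the alignment grid. The following Python fragment is a minimal illustration, not the implementation used in the experiments: the function names and the `max_slope` parameter are assumptions introduced here, and the paper may parameterize the constraint differently. A cell (i, j) is admitted only if the slope needed to reach it from the start corner, and to reach the end corner from it, stays between 1/max_slope and max_slope.

```python
def itakura_mask(n, m, max_slope=2.0):
    """Return an n x m grid of booleans; True marks cells inside the parallelogram.

    max_slope is a hypothetical parameter name: the steepest (and, inverted,
    the shallowest) path slope allowed from either corner of the grid.
    """
    mask = [[False] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            ri, rj = (n - 1) - i, (m - 1) - j  # steps remaining to the end corner
            mask[i][j] = (
                j <= max_slope * i and i <= max_slope * j        # seen from the start
                and rj <= max_slope * ri and ri <= max_slope * rj  # seen from the end
            )
    return mask


def allowed_width(mask):
    """Count admissible cells per row, i.e. how much warping each position allows."""
    return [sum(row) for row in mask]


widths = allowed_width(itakura_mask(9, 9))
print(widths)
```

For two sequences of equal length, the per-row counts are smallest in the first and last rows and largest near the middle, which mirrors the behaviour described in the text: no warping at the ends and the most freedom halfway through.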

In contrast, consider the problem of comparing braided and upbraid. While these two words share a lot of structure, their differences are concentrated at both ends, exactly where the Itakura parallelogram would penalize them by forcing an alignment that is almost Euclidean. Such comparisons need just as much flexibility at every point, which describes a band.
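The band intuition can be tried directly on the two words. The sketch below is only an illustration under stated assumptions: the function names, the `radius` parameter and the 0/1 letter-mismatch cost are choices made here for the text proxy, not the measures or settings evaluated in the paper. It computes a DTW cost restricted to the cells admitted by a mask and compares a zero-width band (a lock-step, Euclidean-like alignment) with a band of radius 2.

```python
import math


def sakoe_chiba_mask(n, m, radius):
    """True where |i - j| <= radius: the same amount of slack at every position."""
    return [[abs(i - j) <= radius for j in range(m)] for i in range(n)]


def constrained_dtw(a, b, mask):
    """DTW cost between sequences a and b using only cells where mask is True.

    The local cost is 0 for equal symbols and 1 otherwise (an illustrative
    choice for text; numeric series would use a numeric difference instead).
    """
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if not mask[i - 1][j - 1]:
                continue  # cell lies outside the constraint region
            local = 0.0 if a[i - 1] == b[j - 1] else 1.0
            cost[i][j] = local + min(
                cost[i - 1][j - 1],  # advance in both sequences
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
            )
    return cost[n][m]


a, b = "braided", "upbraid"
diagonal_only = constrained_dtw(a, b, sakoe_chiba_mask(len(a), len(b), 0))
banded = constrained_dtw(a, b, sakoe_chiba_mask(len(a), len(b), 2))
print(diagonal_only, banded)
```

With radius 0 only the diagonal is admissible, so the cost is a letter-by-letter count of mismatches. A radius of 2 is enough for the shared substring "braid" to line up despite the two-letter offset, and the cost drops accordingly.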

We believe that this consideration exactly explains when the Sakoe-Chiba band or the Itakura parallelogram is most appropriate. For example, in the GunPoint dataset, the creators used a metronome that produced an audible bleep every five seconds; this provided a cue that aligned the ends of the signal snippet. This explains why the Itakura parallelogram works well on this dataset. A good example is the GunPointMaleVersusFemale dataset in combination with the DTW measure. Here, the error rate of the SC (Sakoe-Chiba) version is more than twice that of the IT (Itakura) version (0.0058 and 0.0024, respectively). The alignment at the beginning and at the end of the time series is clearly visible in Fig. 27, where several time series from this dataset are plotted.

Fig. 27 Several time series from the GunPointMaleVersusFemale dataset

Another example of good IT performance is provided by datasets in which images are converted to time series by measuring and recording the local angle at each pixel of an image contour. This idea is illustrated in Fig. 28 on the FaceFour dataset. The same idea is applied in the Plane dataset, where the contours of 7 different airplanes are used to create a time-series dataset in the same manner. In both datasets, the time series are aligned at the beginning and the end, while in the middle section the variations are significant, as seen in Fig. 29.

Fig. 28 Transformation of a raw image shape to a time series, illustrated on the FaceFour dataset [26]

Fig. 29 Several time series from the FaceFour (A) and Plane (B) datasets

This makes them more appropriate for the IT constraint, which is evident in the obtained classification error rates:

  • For the FaceFour dataset using the LCS measure: 0.0089 (IT), 0.0179 (SC)

  • For the Plane dataset applying the DTW measure: 0.0029 (IT), 0.0038 (SC)

In contrast to the datasets suited to the IT approach, in datasets where SC performs better the time series do not necessarily need to be aligned at the beginning and at the end. Datasets with this property are more frequent and can easily be found in the UCR repository. Two such datasets (only a few time series are plotted) are shown in Fig. 30.

Fig. 30 Several time series from the GesturePebbleZ2 (A) and BeetleFly (B) datasets

In both datasets, SC produced notably lower error rates:

  • For the GesturePebbleZ2 dataset using DTW: 0.1384 (IT), 0.0671 (SC)

  • For the BeetleFly dataset applying DTW: 0.2225 (IT), 0.125 (SC)
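To put the error rates quoted in this appendix side by side, the following short Python sketch computes, for each reported dataset and measure, which constraint has the lower error and by what factor. The numbers are copied from the text above; the dictionary layout and the `compare` helper are assumptions made here for illustration only.

```python
# (dataset, measure) -> (IT error rate, SC error rate), as quoted in this appendix
reported = {
    ("GunPointMaleVersusFemale", "DTW"): (0.0024, 0.0058),
    ("FaceFour", "LCS"): (0.0089, 0.0179),
    ("Plane", "DTW"): (0.0029, 0.0038),
    ("GesturePebbleZ2", "DTW"): (0.1384, 0.0671),
    ("BeetleFly", "DTW"): (0.2225, 0.125),
}


def compare(it_error, sc_error):
    """Return the constraint with the lower error and the worse/better error ratio."""
    if it_error < sc_error:
        return "IT", sc_error / it_error
    return "SC", it_error / sc_error


for (dataset, measure), (it_error, sc_error) in reported.items():
    winner, ratio = compare(it_error, sc_error)
    print(f"{dataset:26s} {measure}: {winner} lower, error ratio {ratio:.2f}")
```

A ratio above 2 for GunPointMaleVersusFemale corresponds to the "more than twice" statement in the text, while for GesturePebbleZ2 and BeetleFly the comparison goes the other way, in favour of SC.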

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Geler, Z., Kurbalija, V., Ivanović, M. et al. Elastic distances for time-series classification: Itakura versus Sakoe-Chiba constraints. Knowl Inf Syst 64, 2797–2832 (2022). https://doi.org/10.1007/s10115-022-01725-1

  • DOI: https://doi.org/10.1007/s10115-022-01725-1
