Accurate and fast time series classification based on compressed random Shapelet Forest

Applied Intelligence

Abstract

Achieving accurate, fast, and interpretable time series classification (TSC) has attracted considerable attention from the data mining community over the past decades. In this paper, we propose an efficient algorithm, called Compressed Random Shapelet Forest (CRSF), to tackle this problem. Unlike most shapelet-based TSC methods, CRSF obtains promising performance by greatly compressing the shapelet feature space. To achieve this compression, the time series dataset and the shapelets are first represented with Symbolic Aggregate approXimation (SAX). Then, shapelet-based decision trees are built upon a pool of high-quality shapelet candidates from which useless and self-similar shapelets have been pre-pruned. A new function for measuring the distance between two SAX-represented time series is also introduced. Extensive experiments were conducted on 50 UCR time series datasets. The results show that (1) CRSF achieves the highest average accuracy on these datasets and outperforms most existing shapelet-based TSC methods; and (2) CRSF is slightly superior to the generalized Random Shapelet Forest (gRSF) in accuracy and significantly superior in time cost: on average, it is 41 times faster than gRSF in these experiments.
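To make the SAX step concrete, the following is a minimal sketch of the standard SAX transform (z-normalization, piecewise aggregate approximation, then discretization against equiprobable Gaussian breakpoints). It is not the CRSF implementation; the function names `znormalize`, `paa`, and `sax`, and the parameters `n_segments` and `alphabet_size`, are assumptions chosen for illustration. The paper's new distance function is not described in the abstract and is therefore not sketched here.

```python
from bisect import bisect_left
from statistics import NormalDist, mean, pstdev


def znormalize(series):
    """Rescale a series to zero mean and unit variance."""
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        # A constant series carries no shape information; map it to zeros.
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def paa(series, n_segments):
    """Piecewise aggregate approximation: the mean of each segment."""
    n = len(series)
    if not 1 <= n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(series)")
    return [
        mean(series[i * n // n_segments:(i + 1) * n // n_segments])
        for i in range(n_segments)
    ]


def sax(series, n_segments, alphabet_size):
    """Return the SAX word of a series (illustrative sketch only)."""
    if not 2 <= alphabet_size <= 26:
        raise ValueError("alphabet_size must be between 2 and 26")
    # Breakpoints split the standard normal into equiprobable regions.
    gauss = NormalDist()
    breakpoints = [
        gauss.inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)
    ]
    letters = "abcdefghijklmnopqrstuvwxyz"
    # Each segment mean maps to the letter of the region it falls into.
    return "".join(
        letters[bisect_left(breakpoints, value)]
        for value in paa(znormalize(series), n_segments)
    )


if __name__ == "__main__":
    print(sax([1, 2, 3, 4, 5, 6, 7, 8], n_segments=4, alphabet_size=4))
```

A series of length 8 is reduced here to a 4-letter word, which illustrates the kind of compression the abstract refers to: both storage and the cost of comparing a shapelet with a series shrink with the word length.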



Acknowledgements

The authors thank the reviewers for their work and the contributors of the UCR archive. This work is supported by the open project fund of the Intelligent Terminal Key Laboratory of Sichuan Province (Grant No. SCITLAB-1002), the open fund of the Key Laboratory of Internet Natural Language Processing of Sichuan Province Education Department (Grant No. INLP201906), and the fund of the Science and Technology Bureau of Leshan Town (Grant Nos. 21SZD092, 20GZD020).

Author information

Corresponding author

Correspondence to Siyuan Jing.

Cite this article

Yang, J., Jing, S. & Huang, G. Accurate and fast time series classification based on compressed random Shapelet Forest. Appl Intell 53, 5240–5258 (2023). https://doi.org/10.1007/s10489-022-03852-2
