Accurate and fast time series classification based on compressed random Shapelet Forest

Yang, Jun; Jing, Siyuan; Huang, Guanying

doi:10.1007/s10489-022-03852-2

Published: 21 June 2022

Applied Intelligence, Volume 53, pages 5240–5258 (2023)

Abstract

Accurate, fast, and interpretable time series classification (TSC) has attracted considerable attention from the data mining community over the past decades. In this paper, we propose an efficient algorithm, called Compressed Random Shapelet Forest (CRSF), to tackle this problem. Unlike most shapelet-based TSC methods, CRSF obtains promising performance by greatly compressing the shapelet feature space. To achieve this compression, the time series dataset and the shapelets are first represented by Symbolic Aggregate approXimation (SAX). Then, shapelet-based decision trees are built upon a pool of high-quality shapelet candidates from which useless shapelets and self-similar shapelets have been pre-pruned. A new function for measuring the distance between two SAX-represented time series is also introduced. Extensive experiments were conducted on 50 UCR time series datasets. The results show that (1) CRSF achieves the highest average accuracy on these datasets and outperforms most existing shapelet-based TSC methods; (2) CRSF is slightly superior to gRSF (generalized random shapelet forest) in accuracy and significantly superior in time cost: on average, it is 41 times faster than gRSF in the experiments.
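
As a rough illustration of the SAX step mentioned in the abstract, the following Python sketch applies the standard SAX procedure: z-normalization, Piecewise Aggregate Approximation (PAA), then discretization with equiprobable Gaussian breakpoints. It is not the authors' implementation. The function names and the parameters n_segments and alphabet_size are illustrative assumptions, and neither CRSF's shapelet pruning nor its new distance function is shown.

```python
import bisect
from statistics import NormalDist, mean, pstdev


def znormalize(series):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        # A constant series carries no shape information; map it to zeros.
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def paa(series, n_segments):
    """Piecewise Aggregate Approximation: mean of each of n_segments frames.

    Assumes len(series) >= n_segments so that every frame is non-empty.
    """
    n = len(series)
    frames = []
    for i in range(n_segments):
        start = i * n // n_segments
        end = (i + 1) * n // n_segments
        frames.append(mean(series[start:end]))
    return frames


def breakpoints(alphabet_size):
    """Cut points that split the standard normal into equiprobable regions."""
    nd = NormalDist()
    return [nd.inv_cdf(i / alphabet_size) for i in range(1, alphabet_size)]


def sax(series, n_segments, alphabet_size):
    """Return the SAX word (a string over 'a', 'b', ...) for a numeric series."""
    cuts = breakpoints(alphabet_size)
    word = []
    for value in paa(znormalize(series), n_segments):
        # The symbol index is the number of breakpoints at or below the value.
        idx = bisect.bisect_right(cuts, value)
        word.append(chr(ord("a") + idx))
    return "".join(word)


if __name__ == "__main__":
    # A step series: the low half maps to 'a', the high half to 'd'.
    print(sax([0, 0, 0, 0, 10, 10, 10, 10], n_segments=4, alphabet_size=4))  # aadd
```

Under these assumptions, an 8-point series is reduced to a 4-symbol word, which gives a sense of how a symbolic representation can shrink the space of candidate shapelets that must be compared.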

Acknowledgements

The authors thank the reviewers for their work and the contributors of the UCR archive. This work is supported by the open project fund of the Intelligent Terminal Key Laboratory of Sichuan Province (Grant No. SCITLAB-1002), the open fund of the Key Laboratory of Internet Natural Language Processing of Sichuan Province Education Department (Grant No. INLP201906), and the fund of the Science and Technology Bureau of Leshan Town (Grant Nos. 21SZD092, 20GZD020).

Author information

Authors and Affiliations

Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu, 610041, China
Jun Yang
University of Chinese Academy of Sciences, Beijing, 100049, China
Jun Yang
School of Electronic Information and Artificial Intelligence, Leshan Normal University, Leshan, 614000, China
Jun Yang, Siyuan Jing & Guanying Huang
Intelligent Terminal Key Laboratory of Sichuan Province, Yibin, 644000, China
Siyuan Jing

Corresponding author

Correspondence to Siyuan Jing.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Yang, J., Jing, S. & Huang, G. Accurate and fast time series classification based on compressed random Shapelet Forest. Appl Intell 53, 5240–5258 (2023). https://doi.org/10.1007/s10489-022-03852-2

Accepted: 03 June 2022
Published: 21 June 2022
Issue Date: March 2023
DOI: https://doi.org/10.1007/s10489-022-03852-2
