Extracting Features from Random Subseries: A Hybrid Pipeline for Time Series Classification and Extrinsic Regression

Middlehurst, Matthew; Bagnall, Anthony

doi:10.1007/978-3-031-49896-1_8

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14343)

Included in the following conference series:

International Workshop on Advanced Analytics and Learning on Temporal Data

Abstract

In the time series classification (TSC) literature, approaches that incorporate multiple feature extraction domains, such as HIVE-COTE and TS-CHIEF, have generally been shown to perform better than single-domain approaches when no expert knowledge is available for the data. Time series extrinsic regression (TSER) has seen very little activity compared to TSC, but the provision of benchmark datasets for regression by researchers at Monash University and the University of East Anglia provides an opportunity to see whether this insight from the TSC literature applies to regression data. We show that extracting random shapelets and intervals from different series representations and concatenating the output as part of a feature extraction pipeline significantly outperforms the single-domain approaches for both classification and regression. In addition to our main contribution, we provide results for shapelet-based algorithms on the regression archive datasets using the RDST transform, and show that current interval-based approaches such as DrCIF can find noticeable scalability improvements by adopting the pipeline format.
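
The idea of concatenating random interval and shapelet features drawn from several series representations can be illustrated with a minimal sketch. This is not the authors' implementation and does not reproduce RDST or DrCIF: the class name `HybridSubseriesTransform`, the two representations (raw series and first difference), the interval summaries (mean and standard deviation), the undilated shapelets, and all parameter values are assumptions chosen only to keep the example small. Equal-length univariate series are assumed.

```python
import random
import statistics


def first_difference(series):
    """Second representation: differences between consecutive points."""
    return [b - a for a, b in zip(series, series[1:])]


# Assumed representations for illustration: raw series and first difference.
REPRESENTATIONS = [lambda s: list(s), first_difference]


def min_shapelet_distance(series, shapelet):
    """Smallest squared Euclidean distance between the shapelet and any window."""
    m = len(shapelet)
    return min(
        sum((x - y) ** 2 for x, y in zip(series[i:i + m], shapelet))
        for i in range(len(series) - m + 1)
    )


class HybridSubseriesTransform:
    """Hypothetical sketch: random intervals and shapelets per representation."""

    def __init__(self, n_intervals=4, n_shapelets=4, min_interval=3,
                 shapelet_len=5, seed=0):
        self.n_intervals = n_intervals
        self.n_shapelets = n_shapelets
        self.min_interval = min_interval
        self.shapelet_len = shapelet_len
        self.seed = seed

    def fit(self, train):
        rng = random.Random(self.seed)
        self.params_ = []
        for rep in REPRESENTATIONS:
            series_set = [rep(s) for s in train]
            length = len(series_set[0])  # assumes equal-length series
            # Random (start, end) intervals of at least min_interval points.
            intervals = []
            for _ in range(self.n_intervals):
                start = rng.randrange(0, length - self.min_interval + 1)
                end = rng.randrange(start + self.min_interval, length + 1)
                intervals.append((start, end))
            # Random shapelets copied from randomly chosen training series.
            shapelets = []
            for _ in range(self.n_shapelets):
                source = rng.choice(series_set)
                start = rng.randrange(0, length - self.shapelet_len + 1)
                shapelets.append(source[start:start + self.shapelet_len])
            self.params_.append((intervals, shapelets))
        return self

    def transform(self, data):
        rows = []
        for s in data:
            row = []
            for rep, (intervals, shapelets) in zip(REPRESENTATIONS, self.params_):
                r = rep(s)
                for start, end in intervals:
                    window = r[start:end]
                    row.append(statistics.fmean(window))
                    row.append(statistics.pstdev(window))
                for shp in shapelets:
                    row.append(min_shapelet_distance(r, shp))
            rows.append(row)  # one concatenated feature vector per series
        return rows


# Toy data: 6 random series of length 20.
data_rng = random.Random(1)
train = [[data_rng.gauss(0, 1) for _ in range(20)] for _ in range(6)]
features = HybridSubseriesTransform().fit(train).transform(train)
print(len(features), len(features[0]))  # 6 rows, 2 * (4 * 2 + 4) = 24 features
```

The resulting rows form an ordinary tabular feature matrix, so under these assumptions the same transform could feed either a classifier or a regressor, which is the sense in which a pipeline format applies to both TSC and TSER.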

Acknowledgements

This work is supported by the UK Engineering and Physical Sciences Research Council (EPSRC) grant number EP/W030756/1. The experiments were carried out on the High Performance Computing Cluster supported by the Research and Specialist Computing Support service at the University of East Anglia (UEA). We would like to thank all those responsible for helping maintain the time series dataset archives and those contributing to open source implementations of the algorithms.

Author information

Authors and Affiliations

School of Electronics and Computer Science, University of Southampton, Southampton, UK
Matthew Middlehurst & Anthony Bagnall

Corresponding author

Correspondence to Matthew Middlehurst.

Editor information

Editors and Affiliations

University College Dublin, Dublin, Ireland
Georgiana Ifrim
University of Rennes 2, Rennes, France
Romain Tavenard
University of Southampton, Southampton, UK
Anthony Bagnall
Humboldt University of Berlin, Berlin, Germany
Patrick Schaefer
University of Rennes, Rennes, France
Simon Malinowski
Claude Bernard University Lyon 1, Villeurbanne, France
Thomas Guyet
Orange Innovation, Lannion, France
Vincent Lemaire

About this paper

Cite this paper

Middlehurst, M., Bagnall, A. (2023). Extracting Features from Random Subseries: A Hybrid Pipeline for Time Series Classification and Extrinsic Regression. In: Ifrim, G., et al. Advanced Analytics and Learning on Temporal Data. AALTD 2023. Lecture Notes in Computer Science, vol 14343. Springer, Cham. https://doi.org/10.1007/978-3-031-49896-1_8

DOI: https://doi.org/10.1007/978-3-031-49896-1_8
Published: 20 December 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-49895-4
Online ISBN: 978-3-031-49896-1
eBook Packages: Computer Science, Computer Science (R0)
