Abstract
Shapelets have recently been proposed for data series classification, due to their ability to capture phase-independent and local information. Decision trees based on shapelets have been shown to provide not only interpretable models but also, in many cases, state-of-the-art predictive performance. Shapelet discovery is, however, computationally costly, and although several techniques for speeding up this task have been proposed, the cost remains prohibitive in many cases. In this work, an ensemble-based method, referred to as Random Shapelet Forest (RSF), is proposed. It builds on the success of the random forest algorithm and is shown to have a lower computational complexity than the original shapelet tree learning algorithm. An extensive empirical investigation shows that the algorithm provides competitive predictive performance and that a proposed way of calculating importance scores can be used to successfully identify influential regions.
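To illustrate why shapelets capture local, phase-independent information, the following minimal sketch computes a distance commonly used in the shapelet literature: the smallest Euclidean distance between a shapelet and any equal-length window of a data series. The abstract does not specify the paper's exact distance or any normalization, so the function name `shapelet_distance` and the unnormalized Euclidean measure are assumptions made for illustration, not the authors' implementation.

```python
import math


def shapelet_distance(series, shapelet):
    """Smallest Euclidean distance between `shapelet` and any
    equal-length contiguous window of `series`.

    Hypothetical helper: plain, unnormalized Euclidean distance is an
    assumption; the source abstract does not state the measure used.
    """
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")

    best_sq = math.inf
    # Slide the shapelet over every possible start position.
    for start in range(len(series) - m + 1):
        sq = 0.0
        for a, b in zip(series[start:start + m], shapelet):
            sq += (a - b) ** 2
            if sq >= best_sq:
                # Early abandon: this window cannot beat the best so far.
                break
        best_sq = min(best_sq, sq)
    return math.sqrt(best_sq)


if __name__ == "__main__":
    bump = [1.0, 2.0, 1.0]
    early = [1.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.0]
    late = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0]
    flat = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    # The bump matches wherever it occurs, so both distances are equal.
    print(shapelet_distance(early, bump), shapelet_distance(late, bump))
    # A series without the bump is farther away.
    print(shapelet_distance(flat, bump))
```

Because the minimum is taken over all window positions, the same pattern yields the same distance regardless of where it occurs in the series. A shapelet-based tree node could then split on a threshold over this distance, although how RSF samples shapelets and chooses thresholds is not described in the abstract.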
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Karlsson, I., Papapetrou, P., Boström, H. (2015). Forests of Randomized Shapelet Trees. In: Gammerman, A., Vovk, V., Papadopoulos, H. (eds) Statistical Learning and Data Sciences. SLDS 2015. Lecture Notes in Computer Science, vol 9047. Springer, Cham. https://doi.org/10.1007/978-3-319-17091-6_8
DOI: https://doi.org/10.1007/978-3-319-17091-6_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-17090-9
Online ISBN: 978-3-319-17091-6
eBook Packages: Computer Science (R0)