Extracting diverse-shapelets for early classification on time series

World Wide Web

Abstract

In recent years, early classification on time series has become increasingly important in time-sensitive applications. Existing shapelet-based methods still do not work well on this problem, for two reasons. First, the effectiveness of traditional shapelet-based methods is influenced by the number of shapelet candidates. Second, previous methods have difficulty obtaining diverse shapelets during shapelet selection. In this paper, we propose an Improved Early Distinctive Shapelet Classification method, named IEDSC. First, we present a new method that measures the similarity between time series more precisely by taking into account the relative trend of the time series. Second, in shapelet extraction, we propose a pruning technique that reduces the number of shapelets by predicting the starting positions of good-quality shapelets. In addition, we propose a new shapelet selection method that removes similar shapelets so as to maintain the diversity of the shapelets. Finally, experimental results on 16 benchmark datasets show that the proposed method outperforms state-of-the-art methods for early classification on time series.
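
For readers unfamiliar with shapelets, the following is a minimal sketch of the conventional shapelet-to-series distance and of how a shapelet can trigger a decision on a prefix of a series. It is not the IEDSC similarity measure, which the abstract says also accounts for relative trend. The function names `subsequence_distance` and `earliest_match`, the plain Euclidean distance, and the fixed `threshold` are all assumptions made for illustration.

```python
import math


def subsequence_distance(series, shapelet):
    """Smallest Euclidean distance between `shapelet` and any
    equal-length window of `series` (assumed, conventional definition)."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best_sq = math.inf
    for start in range(len(series) - m + 1):
        sq = sum((series[start + i] - shapelet[i]) ** 2 for i in range(m))
        best_sq = min(best_sq, sq)
    return math.sqrt(best_sq)


def earliest_match(series, shapelet, threshold):
    """Shortest prefix length of `series` at which the shapelet matches
    within `threshold` (a hypothetical parameter), or None if it never does."""
    m = len(shapelet)
    for end in range(m, len(series) + 1):
        # Only the newest window needs checking: earlier windows were
        # already tested at shorter prefix lengths.
        window = series[end - m:end]
        if subsequence_distance(window, shapelet) <= threshold:
            return end
    return None
```

In this sketch, an early classifier would assign the class associated with the shapelet as soon as `earliest_match` returns a prefix length, without waiting for the rest of the series.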

Acknowledgment

The authors would like to thank Prof. Eamonn Keogh and all the people who have contributed to the UCR time series classification archive for their selfless work. We also thank the anonymous reviewers for their valuable advice.

This work is supported by the National Natural Science Foundation of China (No. 61702468), the Open Research Project of the Hubei Key Laboratory of Intelligent Geo-Information Processing (No. KLIGIP-2018B03), and the Zhejiang Provincial Natural Science Foundation of China (No. LZ18F020001).

Author information

Corresponding author

Correspondence to Guiling Li.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Yan, W., Li, G., Wu, Z. et al. Extracting diverse-shapelets for early classification on time series. World Wide Web 23, 3055–3081 (2020). https://doi.org/10.1007/s11280-020-00820-z
