A Novel Decision Tree Approach for the Handling of Time Series

  • Conference paper
Mining Intelligence and Knowledge Exploration (MIKE 2018)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11308)

Abstract

Time series play a major role in many analysis tasks. For example, in the stock market they can be used to model price histories and to predict future trends. Sometimes, the information contained in a time series is complemented by other kinds of data, which may be encoded by static attributes, e.g., categorical or numeric ones, or by more general discrete data sequences. In this paper, we present J48SS, a novel decision tree learning algorithm capable of natively mixing static, sequential, and time series data for classification purposes. The proposed solution is based on the well-known C4.5 decision tree learner, and it relies on the concept of time series shapelets, which are generated by means of multi-objective evolutionary computation techniques and, unlike in most previous approaches, are not required to be part of the training set. We evaluate the algorithm against a set of well-known UCR time series datasets, and we show that it provides better classification performance than previous approaches based on decision trees, while generating highly interpretable models and effectively reducing the data preparation effort.
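
To make the notion of a shapelet concrete, the following is a minimal sketch of how a shapelet is commonly used as a split test in a decision tree: the distance between a series and a shapelet is taken as the smallest Euclidean distance between the shapelet and any same-length contiguous window of the series, and a node compares that distance with a threshold. This is an illustration under stated assumptions, not the J48SS implementation; the function names, the example data, and the threshold are all hypothetical, and the abstract does not specify which distance measure J48SS uses.

```python
import math
from typing import Sequence


def shapelet_distance(series: Sequence[float], shapelet: Sequence[float]) -> float:
    """Smallest Euclidean distance between the shapelet and any
    same-length contiguous window of the series."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best = math.inf
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        dist = math.sqrt(sum((w - s) ** 2 for w, s in zip(window, shapelet)))
        best = min(best, dist)
    return best


def split_on_shapelet(series: Sequence[float],
                      shapelet: Sequence[float],
                      threshold: float) -> bool:
    """Hypothetical tree-node test: True if the series contains a
    window lying within `threshold` of the shapelet."""
    return shapelet_distance(series, shapelet) <= threshold


# Illustrative data (assumed, not from the paper). The shapelet is a
# free-standing pattern and does not occur verbatim in either series.
shapelet = [0.0, 1.0, 0.0]
with_peak = [0.0, 0.0, 0.1, 0.9, 0.1, 0.0]
flat = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

print(split_on_shapelet(with_peak, shapelet, threshold=0.5))
print(split_on_shapelet(flat, shapelet, threshold=0.5))
```

In this sketch the series with a peak falls on one side of the split and the flat series on the other, even though the shapelet itself is not a subsequence of either one, which mirrors the abstract's point that shapelets need not be drawn from the training set.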

Acknowledgments

Andrea Brunello and Angelo Montanari would like to thank the PRID project ENCASE (Efforts in the uNderstanding of Complex interActing SystEms) for its support.

Author information

Corresponding author

Correspondence to Andrea Brunello.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Brunello, A., Marzano, E., Montanari, A., Sciavicco, G. (2018). A Novel Decision Tree Approach for the Handling of Time Series. In: Groza, A., Prasath, R. (eds) Mining Intelligence and Knowledge Exploration. MIKE 2018. Lecture Notes in Computer Science, vol 11308. Springer, Cham. https://doi.org/10.1007/978-3-030-05918-7_32

  • DOI: https://doi.org/10.1007/978-3-030-05918-7_32

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-05917-0

  • Online ISBN: 978-3-030-05918-7

  • eBook Packages: Computer Science, Computer Science (R0)
