Classification of multivariate time series via temporal abstraction and time intervals mining

  • Regular Paper
  • Knowledge and Information Systems

Abstract

Classification of multivariate time series data, which often include both time points and intervals sampled at variable frequencies, is a challenging task. We introduce the KarmaLegoSification (KLS) framework for the classification of multivariate time series, which implements three phases: (1) applying a temporal abstraction process that transforms a series of raw time-stamped data points into a series of symbolic time intervals; (2) mining these symbolic time intervals to discover frequent time-interval-related patterns (TIRPs), using Allen’s temporal relations; and (3) using the TIRPs as features to induce a classifier. To efficiently detect multiple TIRPs (features) in a single entity to be classified, we introduce a new algorithm, SingleKarmaLego, which can be shown to be superior for that purpose to a Sequential TIRPs Detection algorithm. We evaluated the KLS framework on datasets in the domains of diabetes, intensive care, and infectious hepatitis, assessing the effects of the various settings of the KLS framework. Discretization using Symbolic Aggregate approXimation (SAX) led to better performance than equal-width discretization (EWD); knowledge-based cut-off definitions, when available, were superior to both. Using three abstract temporal relations was superior to using the seven core temporal relations. An epsilon value larger than zero tended to yield slightly better accuracy with the SAX discretization method but reduced accuracy with EWD, and overall did not seem beneficial. No feature selection method we tried proved useful. Regarding feature (TIRP) representation, mean duration performed better than horizontal support, which in turn performed better than the default binary (existence) representation method.
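As a rough illustration of phase (1), the following Python sketch abstracts a single raw series into symbolic time intervals using equal-width discretization (EWD). It is a minimal sketch and not the KLS implementation: the function names, the use of integer state indices as symbols, and the rule of merging consecutive samples that fall into the same state are all assumptions made for the example.

```python
from typing import List, Tuple


def equal_width_cutoffs(values: List[float], n_bins: int) -> List[float]:
    """Return the n_bins - 1 cut-offs that split [min, max] into equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + width * i for i in range(1, n_bins)]


def to_state(value: float, cutoffs: List[float]) -> int:
    """Map a raw value to a state index: the number of cut-offs it reaches."""
    return sum(value >= c for c in cutoffs)


def abstract_series(
    series: List[Tuple[float, float]], n_bins: int
) -> List[Tuple[int, float, float]]:
    """Turn (timestamp, value) points into (state, start, end) intervals.

    Assumption: consecutive samples with the same state are merged into one
    interval spanning from the first to the last of those timestamps.
    """
    points = sorted(series)
    cutoffs = equal_width_cutoffs([v for _, v in points], n_bins)
    intervals: List[Tuple[int, float, float]] = []
    for t, v in points:
        state = to_state(v, cutoffs)
        if intervals and intervals[-1][0] == state:
            # Same state as the previous sample: extend the open interval.
            intervals[-1] = (state, intervals[-1][1], t)
        else:
            intervals.append((state, t, t))
    return intervals


if __name__ == "__main__":
    raw = [(0, 1.0), (1, 2.0), (2, 9.0), (3, 10.0), (4, 5.0)]
    print(abstract_series(raw, n_bins=3))
```

A knowledge-based abstraction would replace `equal_width_cutoffs` with cut-offs supplied by a domain expert; the merging step stays the same.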

Notes

  1. Karma—The law of cause and effect originated in ancient India and is central to Hindu and Buddhist philosophies.

  2. Lego—A popular game, in which modular bricks are used to construct different objects. [also, Le(t)go].

References

  1. Allen JF (1983) Maintaining knowledge about temporal intervals. Commun ACM 26(11):832–843

  2. Azulay R, Moskovitch R, Stopel D, Verduijn M, de Jonge E, Shahar Y (2007) Temporal discretization of medical time series - a comparative study. In: IDAMAP 2007, Amsterdam, The Netherlands

  3. Batal I, Valizadegan H, Cooper G, Hauskrecht M (2012a) A temporal pattern mining approach for classifying electronic health record data. ACM Transactions on Intelligent Systems and Technology (ACM TIST), Special Issue on Health Informatics

  4. Batal I, Fradkin D, Harrison J, Moerchen F, Hauskrecht M (2012b) Mining recent temporal patterns for event detection in multivariate time series data. In: Proceedings of knowledge discovery and data mining (KDD), Beijing, China

  5. Höppner F (2001) Learning temporal rules from state sequences. In: Proceedings of IJCAI Workshop on Learning from Temporal and Spatial Data (WLTSD-01), Seattle, USA, pp 25–31

  6. Höppner F (2002) Time series abstraction methods - a survey. In: Workshop on knowledge discovery in databases, Dortmund

  7. Hu B, Chen Y, Keogh E (2013) Time series classification under more realistic assumptions. In: Proceedings of SIAM data mining

  8. Kam PS, Fu AWC (2000) Discovering temporal patterns for interval based events. In: Proceedings DaWaK-00

  9. Lin J, Keogh E, Lonardi S, Chiu B (2003) A symbolic representation of time series with implications for streaming algorithms. In: 8th ACM SIGMOD DMKD workshop

  10. Mörchen F, Ultsch A (2005) Optimizing time series discretization for knowledge discovery. In: Proceedings of the Eleventh ACM SIGKDD international conference on knowledge discovery in data mining, Chicago, Illinois, pp 660–665

  11. Mörchen F (2006) Algorithms for time series knowledge mining. In: Proceedings of KDD

  12. Moerchen F (2006) A better tool than Allen’s relations for expressing temporal knowledge in interval data. In: Workshop on temporal data mining

  13. Moerchen F, Fradkin D (2010) Robust mining of time intervals with semi-interval partial order patterns. In: Proceedings of SIAM data mining

  14. Moskovitch R, Hessing A, Shahar Y (2004) Vaidurya-a concept-based, context-sensitive search engine for clinical guidelines. Medinfo 11:140–144

  15. Moskovitch R, Stopel D, Verduijn M, Peek N, de Jonge E, Shahar Y (2007) Analysis of ICU patients using the time series knowledge mining method. In: IDAMAP 2007, Amsterdam, The Netherlands

  16. Moskovitch R, Gus I, Pluderman S, Stopel D, Glezer C, Shahar Y, Elovici Y (2007) Detection of unknown computer worms activity based on computer behavior using data mining. In: IEEE Symposium on Computational Intelligence and Data Mining, Honolulu, Hawaii

  17. Moskovitch R, Shahar Y (2009) Vaidurya: a multiple-ontology, concept-based, context-sensitive clinical-guideline search engine. J Biomed Inform 42(1):11–21

  18. Moskovitch R, Shahar Y (2009) Medical temporal-knowledge discovery via temporal abstraction. In: AMIA 2009, San Francisco, USA

  19. Moskovitch R, Peek N, Shahar Y (2009) Classification of ICU patients via temporal abstraction and temporal patterns mining. In: IDAMAP, Verona, Italy

  20. Moskovitch R, Shahar Y (2013) Fast time intervals mining using transitivity of temporal relations. Knowl Inf Syst. doi:10.1007/s10115-013-0707-x

  21. Moskovitch R, Shahar Y (2014) Fast detection of time intervals related patterns, TechReport 11/14. Ben Gurion University, Beer Sheva, Israel

  22. Moskovitch R, Walsh C, Hripcsak G, Tatonetti N (2014) Prediction of biomedical events via time intervals mining. In: Proceedings of ACM SIGKDD workshop on connected health at big data Era (BigCHat2014), New York, US

  23. Papapetrou P, Kollios G, Sclaroff S, Gunopulos D (2009) Mining frequent arrangements of temporal intervals. Knowl Inf Syst 21(2):133–171

  24. Patel D, Hsu W, Lee ML (2008) Mining relationships among interval-based events for classification. In: Proceedings of the 2008 ACM SIGMOD international conference on management of data, pp 393–404

  25. Pei J, Han J, Mortazavi-Asl B, Pinto H, Chen Q, Dayal U, Hsu MC (2001) PrefixSpan: mining sequential patterns efficiently by prefix-projected pattern growth. In: Proceedings of the 17th international conference data engineering (ICDE ’01)

  26. Rabiner LR (1989) A tutorial on Hidden Markov Models and selected applications in speech recognition. Proc IEEE 77(2):257–286

  27. Ratanamahatana C, Keogh EJ (2005) Three myths about dynamic time warping data mining. In: Proceedings of SIAM data mining

  28. Roddick J, Spiliopoulou M (2002) A survey of temporal knowledge discovery paradigms and methods. IEEE Trans Knowl Data Eng 14(4):750–767

  29. Sacchi L, Larizza C, Combi C, Bellazi R (2007) Data mining with temporal abstractions: learning rules from time series. Data Mining Knowl Discov 15(2):217–247

  30. Shahar Y (1997) A framework for knowledge-based temporal abstraction. Artif Intell 90(1–2):79–133

  31. Shahar Y (1998) Dynamic temporal interpretation contexts for temporal abstraction. Ann Math Artif Intell 22(1–2):159–192

  32. Shahar Y (1999) Knowledge-based temporal interpolation. J Exp Theor Artif Intell 11:102–111

  33. Shahar Y, Chen H, Stites D, Basso L, Kaizer H, Wilson D, Musen MA (1999) Semiautomated acquisition of clinical temporal-abstraction knowledge. J Am Med Inform Assoc 6(6):494–511

  34. Shknevsky A, Moskovitch R, Shahar Y (2014) Semantic considerations in time intervals mining. In: Proceedings of ACM SIGKDD workshop on connected health at big data Era (BigCHat2014), New York, US

  35. Stopel D, Boger Z, Moskovitch R, Shahar Y, Elovici Y (2006a) Application of artificial neural networks techniques to computer worm detection. In: International joint conference on neural networks, pp 2362–2369

  36. Stopel D, Boger Z, Moskovitch R, Shahar Y, Elovici Y (2006b) Improving worm detection with artificial neural networks through feature selection and temporal analysis techniques. In: Proceedings of the third international conference on neural networks, Barcelona

  37. Verduijn M, Sacchi L, Peek N, Bellazi R, de Jonge E, de Mol B (2007) Temporal abstraction for feature extraction: a comparative case study in prediction from intensive care monitoring data. Artif Intell Med 41:1–12

  38. Villafane R, Hua K, Tran D, Maulik B (2000) Knowledge discovery from time series of interval events. J Intell Inf Syst 15(1):71–89

  39. Winarko E, Roddick J (2007) Armada: an algorithm for discovering richer relative temporal association rules from interval-based data. Data Knowl Eng 63(1):76–90

  40. Wu S, Chen Y (2007) Mining non-ambiguous temporal patterns for interval-based events. IEEE Trans Knowl Data Eng 19(6):742–758

Acknowledgments

The authors wish to thank Marion Verduijn for sharing the ICU12h dataset, and Prof. Avi Porath from the Soroka Academic Medical Center for his assistance regarding the diabetes dataset. For insightful discussions on time intervals mining and classification using TIRPs, we would like to express our thanks to Christos Faloutsos, Christian Freksa, Panagiotis Papapetrou, Fabian Moerchen, Dhaval Patel and Iyad Batal, as well as to Guy Ezra for his help with some of the implementations. The authors also wish to acknowledge the highly useful comments of the anonymous reviewers, which have significantly improved this manuscript. This work was supported in part by grants from Deutsche Telekom Laboratories, HP labs Innovation Research Program

Author information

Corresponding author

Correspondence to Robert Moskovitch.

Appendix: Comparing the SingleKarmaLego algorithm to a sequential TIRPs detection algorithm

The theoretical advantages of the SingleKarmaLego (SKL) algorithm over a Sequential TIRPs Detection algorithm, which can be demonstrated through worst-case and average-case analyses, are described in detail elsewhere, as are the complete results of the empirical runtime evaluation [21].

In our full experiment, as described in that study, we compared the runtime of the SKL algorithm to that of a Sequential TIRPs Detection (STD) algorithm in three very different medical domains. Here, we present just a sample of the results. We start by describing the STD algorithm to which we compared SKL.

[Figure: pseudocode listing, not reproduced here]

We first describe the STD algorithm for TIRPs matching; we then show the runtime results on one of the three datasets used in the current study (the ICU dataset).

Algorithm 7 describes a typical Sequential TIRPs Detection algorithm. The algorithm accepts a vector of the single-entity symbolic time intervals (entity_stis) and a list of TIRPs to detect (tirpsList), and detects all of the instances of each TIRP (line 1). For each TIRP, it goes over all of the symbolic time intervals (line 2). When the first symbol of the current TIRP (tirp.s[0]) is detected, a while loop looks for the TIRP’s next symbols and verifies that their temporal relations with the previously detected symbolic time intervals match those in the TIRP’s temporal relations definition (tirp.r). If the time duration between the i-interval and the j-interval is larger than max_gap, the search for instances is stopped.
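The sequential detection procedure described above can be sketched in Python as follows. This is an illustrative reconstruction and not the authors' Algorithm 7: the names entity_stis, tirp.s, tirp.r, and max_gap come from the text, but the `STI` and `TIRP` classes, the encoding of tirp.r as a dictionary keyed by index pairs, the recursive search (instead of a while loop), and the choice to measure the gap from the end of the last matched interval are assumptions. Temporal relations are Allen's seven relations for a pair of intervals ordered by start time and then end time.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class STI:
    """A symbolic time interval (hypothetical structure)."""
    symbol: str
    start: float
    end: float


@dataclass
class TIRP:
    """s: ordered symbols; r[(i, j)]: relation between the i-th and j-th, i < j."""
    s: List[str]
    r: Dict[Tuple[int, int], str]


def relation(a: STI, b: STI) -> str:
    """Allen relation of a to b, assuming a sorts no later than b by (start, end)."""
    if a.end < b.start:
        return "before"
    if a.end == b.start:
        return "meets"
    if a.start == b.start:
        return "equal" if a.end == b.end else "starts"
    if a.end == b.end:
        return "finished-by"
    return "contains" if b.end < a.end else "overlaps"


def find_instances(stis: List[STI], tirp: TIRP, max_gap: float) -> List[Tuple[STI, ...]]:
    """Find all instances of one TIRP in a list of STIs sorted by (start, end)."""
    instances: List[Tuple[STI, ...]] = []

    def extend(chosen: List[STI], next_pos: int) -> None:
        k = len(chosen)
        if k == len(tirp.s):
            instances.append(tuple(chosen))
            return
        for j in range(next_pos, len(stis)):
            cand = stis[j]
            # Later candidates start even later, so the search can stop here.
            if cand.start - chosen[-1].end > max_gap:
                break
            if cand.symbol != tirp.s[k]:
                continue
            # The candidate must relate to every matched interval as tirp.r says.
            if all(relation(chosen[i], cand) == tirp.r[(i, k)] for i in range(k)):
                extend(chosen + [cand], j + 1)

    for pos, first in enumerate(stis):
        if first.symbol == tirp.s[0]:
            extend([first], pos + 1)
    return instances


def detect_tirps_sequentially(
    entity_stis: List[STI], tirps_list: List[TIRP], max_gap: float
) -> List[List[Tuple[STI, ...]]]:
    """Scan the entity once per TIRP; result k holds the instances of TIRP k."""
    stis = sorted(entity_stis, key=lambda x: (x.start, x.end, x.symbol))
    return [find_instances(stis, tirp, max_gap) for tirp in tirps_list]


if __name__ == "__main__":
    entity = [STI("C", 10, 12), STI("A", 0, 5), STI("B", 3, 8)]
    pattern = TIRP(["A", "B", "C"],
                   {(0, 1): "overlaps", (0, 2): "before", (1, 2): "before"})
    print(detect_tirps_sequentially(entity, [pattern], max_gap=5))
```

Because the entity is rescanned for every TIRP in the list, the work in this sketch grows with the number of TIRPs to detect, which is consistent with the runtime behaviour reported for STD.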

In the full set of experiments, we ran both the SKL algorithm and the Sequential TIRPs Detection algorithm on all three datasets used in the current paper, varying the number of TIRPs to detect and the number of entities in which to detect them. These settings reflect the typical conditions of an evaluation run and also give a good estimate of the runtime for a single entity (obtained by dividing the runtime by the number of entities).

Here, we present several typical runtime results for the ICU dataset, in which each entity has a mean of 292 symbolic time intervals. We show the runtime when the input consists of 40, 60, 80, 100, or 120 TIRPs to detect and 50, 100, or 200 entities in which to detect them. The results demonstrate the increasing difference in runtime between the Sequential TIRPs Detection algorithm, which requires longer and longer computation times, and the SKL algorithm, whose runtime grows only slightly even as the number of entities increases significantly.

Figures 23 and 24 show the runtime comparison of the SingleKarmaLego (SKL) algorithm versus the Sequential TIRPs Detection (STD) algorithm on the ICU dataset, with max_gap values of 5 and 10 domain time units (seconds in the case of the ICU domain). For each max_gap value, the algorithms were run on 50, 100, or 200 entities, with 40, 60, 80, 100, or 120 TIRPs to detect. Each graph shows the runtime in seconds on the vertical axis; the horizontal axis shows the number of TIRPs at the top and the number of entities at the bottom. As the figures show, SKL required less computation time for every combination we experimented with, so its advantage is clear across all of the tested numbers of entities and TIRPs to detect.

Fig. 23 A runtime comparison of the SingleKarmaLego (SKL) and Sequential TIRP Detection (STD) algorithms for detecting a set of TIRPs within multiple given entities, for varying numbers of entities and TIRPs to detect, on the ICU dataset. The maximal gap here was 5 domain time units (here, seconds). While SKL’s runtime grows only slightly with the increase in the number of entities and number of TIRPs, the runtime of STD increases quickly, especially when applied to 200 entities.

Fig. 24 A runtime comparison of the SingleKarmaLego (SKL) and Sequential TIRP Detection (STD) algorithms for detecting a set of TIRPs within multiple given entities, for varying numbers of entities and TIRPs to detect, on the ICU dataset. The maximal gap here was 10 domain time units (here, seconds). While SKL’s runtime grows only slightly with the increase in the number of entities and number of TIRPs, the runtime of STD increases quickly already when applied to 100 entities. Compared to max_gap = 5, the STD algorithm slows down sooner.

About this article

Cite this article

Moskovitch, R., Shahar, Y. Classification of multivariate time series via temporal abstraction and time intervals mining. Knowl Inf Syst 45, 35–74 (2015). https://doi.org/10.1007/s10115-014-0784-5

  • DOI: https://doi.org/10.1007/s10115-014-0784-5
