Classification of multivariate time series via temporal abstraction and time intervals mining

Moskovitch, Robert; Shahar, Yuval

doi:10.1007/s10115-014-0784-5

Regular Paper
Published: 01 October 2014

Volume 45, pages 35–74, (2015)
Knowledge and Information Systems

Robert Moskovitch^1,2 & Yuval Shahar^1

Abstract

Classification of multivariate time series data, often including both time points and intervals at variable frequencies, is a challenging task. We introduce the KarmaLegoSification (KLS) framework for the classification of multivariate time series, which implements three phases: (1) application of a temporal abstraction process that transforms a series of raw time-stamped data points into a series of symbolic time intervals; (2) mining these symbolic time intervals to discover frequent time-interval-related patterns (TIRPs), using Allen’s temporal relations; and (3) using the TIRPs as features to induce a classifier. To efficiently detect multiple TIRPs (features) in a single entity to be classified, we introduce a new algorithm, SingleKarmaLego, which can be shown to be superior for that purpose to a Sequential TIRPs Detection algorithm. We evaluated the KLS framework on datasets in the domains of diabetes, intensive care, and infectious hepatitis, assessing the effects of the various settings of the KLS framework. Discretization using Symbolic Aggregate approXimation (SAX) led to better performance than using equal-width discretization (EWD); knowledge-based cut-off definitions, when available, were superior to both. Using three abstract temporal relations was superior to using the seven core temporal relations. Using an epsilon value larger than zero tended to result in slightly better accuracy when using the SAX discretization method, but resulted in reduced accuracy when using EWD, and overall does not seem beneficial. No feature selection method we tried proved useful. Regarding feature (TIRP) representation, mean duration performed better than horizontal support, which in turn performed better than the default binary (existence) representation method.
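To make phase (1) concrete, the following is a minimal sketch of temporal abstraction with equal-width discretization (EWD). It is an illustration under stated assumptions, not the authors' implementation: the names SymbolicInterval, equal_width_cutoffs, to_state, and abstract, as well as the max_merge_gap parameter that controls when adjacent same-state points are concatenated, are hypothetical and do not appear in the paper.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class SymbolicInterval:
    start: float
    end: float
    symbol: int  # index of the equal-width bin (the abstract state)


def equal_width_cutoffs(values: Sequence[float], n_bins: int) -> List[float]:
    """Split [min, max] of the observed values into n_bins bins of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + width * k for k in range(1, n_bins)]


def to_state(value: float, cutoffs: Sequence[float]) -> int:
    """State index = number of cutoffs that the value reaches or exceeds."""
    return sum(value >= c for c in cutoffs)


def abstract(points: Sequence[Tuple[float, float]],
             cutoffs: Sequence[float],
             max_merge_gap: float) -> List[SymbolicInterval]:
    """Turn time-sorted (timestamp, value) points into symbolic time intervals.

    Consecutive points in the same state are concatenated into one interval
    as long as they are at most max_merge_gap apart (an assumed rule).
    """
    intervals: List[SymbolicInterval] = []
    for t, v in points:
        state = to_state(v, cutoffs)
        last = intervals[-1] if intervals else None
        if last is not None and last.symbol == state and t - last.end <= max_merge_gap:
            intervals[-1] = SymbolicInterval(last.start, t, state)  # extend
        else:
            intervals.append(SymbolicInterval(t, t, state))  # open a new interval
    return intervals
```

Knowledge-based cut-offs, where available, could be passed to abstract directly in place of the output of equal_width_cutoffs; SAX would differ only in how the cut-offs are derived.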

Notes

Karma—The law of cause and effect originated in ancient India and is central to Hindu and Buddhist philosophies.
Lego—A popular game, in which modular bricks are used to construct different objects. [also, Le(t)go].

References

Allen JF (1983) Maintaining knowledge about temporal intervals. Commun ACM 26(11):832–843
Azulay R, Moskovitch R, Stopel D, Verduijn M, de Jonge E, Shahar Y (2007) Temporal discretization of medical time series: a comparative study. In: IDAMAP 2007, Amsterdam, The Netherlands
Batal I, Valizadegan H, Cooper G, Hauskrecht M (2012a) A temporal pattern mining approach for classifying electronic health record data. ACM Transactions on Intelligent Systems and Technology (ACM TIST), Special Issue on Health Informatics
Batal I, Fradkin D, Harrison J, Moerchen F, Hauskrecht M (2012b) Mining recent temporal patterns for event detection in multivariate time series data. In: Proceedings of knowledge discovery and data mining (KDD), Beijing, China
Höppner F (2001) Learning temporal rules from state sequences. In: Proceedings of IJCAI Workshop on Learning from Temporal and Spatial Data (WLTSD-01), Seattle, USA, pp 25–31
Höppner F (2002) Time series abstraction methods: a survey. In: Workshop on knowledge discovery in databases, Dortmund
Hu B, Chen Y, Keogh E (2013) Time series classification under more realistic assumptions. In: Proceedings of SIAM data mining
Kam PS, Fu AWC (2000) Discovering temporal patterns for interval based events. In: Proceedings DaWaK-00
Lin J, Keogh E, Lonardi S, Chiu B (2003) A symbolic representation of time series with implications for streaming algorithms. In: 8th ACM SIGMOD DMKD workshop
Mörchen F, Ultsch A (2005) Optimizing time series discretization for knowledge discovery. In: Proceedings of the Eleventh ACM SIGKDD international conference on knowledge discovery in data mining, Chicago, Illinois, pp 660–665
Mörchen F (2006) Algorithms for time series knowledge mining. In: Proceedings of KDD
Moerchen F (2006) A better tool than Allen’s relations for expressing temporal knowledge in interval data. In: Workshop on temporal data mining
Moerchen F, Fradkin D (2010) Robust mining of time intervals with semi-interval partial order patterns. In: Proceedings of SIAM data mining
Moskovitch R, Hessing A, Shahar Y (2004) Vaidurya-a concept-based, context-sensitive search engine for clinical guidelines. Medinfo 11:140–144
Moskovitch R, Stopel D, Verduijn M, Peek N, de Jonge E, Shahar Y (2007) Analysis of ICU patients using the time series knowledge mining method. In: IDAMAP 2007, Amsterdam, The Netherlands
Moskovitch R, Gus I, Pluderman S, Stopel D, Glezer C, Shahar Y, Elovici Y (2007) Detection of unknown computer worms activity based on computer behavior using data mining. In: IEEE Symposium on Computational Intelligence and Data Mining, Honolulu, Hawaii
Moskovitch R, Shahar Y (2009) Vaidurya: a multiple-ontology, concept-based, context-sensitive clinical-guideline search engine. J Biomed Inform 42(1):11–21
Moskovitch R, Shahar Y (2009) Medical temporal-knowledge discovery via temporal abstraction. In: AMIA 2009, San Francisco, USA
Moskovitch R, Peek N, Shahar Y (2009) Classification of ICU patients via temporal abstraction and temporal patterns mining. In: IDAMAP, Verona, Italy
Moskovitch R, Shahar Y (2013) Fast time intervals mining using transitivity of temporal relations. Knowl Inf Syst. doi:10.1007/s10115-013-0707-x
Moskovitch R, Shahar Y (2014) Fast detection of time intervals related patterns, TechReport 11/14. Ben Gurion University, Beer Sheva, Israel
Moskovitch R, Walsh C, Hripcsak G, Tatonetti N (2014) Prediction of biomedical events via time intervals mining. In: Proceedings of ACM SIGKDD workshop on connected health at big data Era (BigCHat2014), New York, US
Papapetrou P, Kollios G, Sclaroff S, Gunopulos D (2009) Mining frequent arrangements of temporal intervals. Knowl Inf Syst 21(2):133–171
Patel D, Hsu W, Lee ML (2008) Mining relationships among interval-based events for classification. In: Proceedings of the 2008 ACM SIGMOD international conference on management of data, pp 393–404
Pei J, Han J, Mortazavi-Asl B, Pinto H, Chen Q, Dayal U, Hsu MC (2001) PrefixSpan: mining sequential patterns efficiently by prefix-projected pattern growth. In: Proceedings of the 17th international conference data engineering (ICDE ’01)
Rabiner LR (1989) A tutorial on Hidden Markov Models and selected applications in speech recognition. Proc IEEE 77(2):257–286
Ratanamahatana C, Keogh EJ (2005) Three myths about dynamic time warping data mining. In: Proceedings of SIAM data mining
Roddick J, Spiliopoulou M (2002) A survey of temporal knowledge discovery paradigms and methods. IEEE Trans Knowl Data Eng 14(4):750–767
Sacchi L, Larizza C, Combi C, Bellazzi R (2007) Data mining with temporal abstractions: learning rules from time series. Data Mining Knowl Discov 15(2):217–247
Shahar Y (1997) A framework for knowledge-based temporal abstraction. Artif Intell 90(1–2):79–133
Shahar Y (1998) Dynamic temporal interpretation contexts for temporal abstraction. Ann Math Artif Intell 22(1–2):159–192
Shahar Y (1999) Knowledge-based temporal interpolation. J Exp Theor Artif Intell 11:102–111
Shahar Y, Chen H, Stites D, Basso L, Kaizer H, Wilson D, Musen MA (1999) Semiautomated acquisition of clinical temporal-abstraction knowledge. J Am Med Inform Assoc 6(6):494–511
Shknevsky A, Moskovitch R, Shahar Y (2014) Semantic considerations in time intervals mining. In: Proceedings of ACM SIGKDD workshop on connected health at big data Era (BigCHat2014), New York, US
Stopel D, Boger Z, Moskovitch R, Shahar Y, Elovici Y (2006a) Application of artificial neural networks techniques to computer worm detection. In: International joint conference on neural networks, pp 2362–2369
Stopel D, Boger Z, Moskovitch R, Shahar Y, Elovici Y (2006b) Improving worm detection with artificial neural networks through feature selection and temporal analysis techniques. In: Proceedings of the third international conference on neural networks, Barcelona
Verduijn M, Sacchi L, Peek N, Bellazzi R, de Jonge E, de Mol B (2007) Temporal abstraction for feature extraction: a comparative case study in prediction from intensive care monitoring data. Artif Intell Med 41:1–12
Villafane R, Hua K, Tran D, Maulik B (2000) Knowledge discovery from time series of interval events. J Intell Inf Syst 15(1):71–89
Winarko E, Roddick J (2007) Armada: an algorithm for discovering richer relative temporal association rules from interval-based data. Data Knowl Eng 63(1):76–90
Wu S, Chen Y (2007) Mining non-ambiguous temporal patterns for interval-based events. IEEE Trans Knowl Data Eng 19(6):742–758

Acknowledgments

The authors wish to thank Marion Verduijn for sharing the ICU12h dataset, and Prof. Avi Porath from the Soroka Academic Medical Center for his assistance regarding the diabetes dataset. For insightful discussions on time intervals mining and classification using TIRPs, we would like to express our thanks to Christos Faloutsos, Christian Freksa, Panagiotis Papapetrou, Fabian Moerchen, Dhaval Patel and Iyad Batal, as well as to Guy Ezra for his help with some of the implementations. The authors also wish to acknowledge the highly useful comments of the anonymous reviewers, which have significantly improved this manuscript. This work was supported in part by grants from Deutsche Telekom Laboratories, HP Labs Innovation Research Program

Author information

Authors and Affiliations

Department of Information Systems Engineering, Ben Gurion University, Beer-Sheva, Israel
Robert Moskovitch & Yuval Shahar
Department of Biomedical Informatics, Systems Biology, and Medicine, Columbia University, New York, NY, USA
Robert Moskovitch

Corresponding author

Correspondence to Robert Moskovitch.

Appendix: Comparing the SingleKarmaLego algorithm to a sequential TIRPs detection algorithm

The theoretical advantages of the SingleKarmaLego (SKL) algorithm over a Sequential TIRPs Detection (STD) algorithm, which can be demonstrated through worst-case and average-case analyses, are described in detail elsewhere, as are the complete results of the empirical runtime evaluation [21].

In our full experiment, as described in that study, we compared the runtime of the SKL algorithm to that of a Sequential TIRPs Detection (STD) algorithm in three very different medical domains. Here, we present just a sample of the results: we first describe the STD algorithm for TIRP matching that SKL was compared to, and then show the runtime results on one of the three datasets used in the current study (the ICU dataset).

Algorithm 7 describes a typical Sequential TIRPs Detection algorithm. The algorithm accepts a vector of the single-entity symbolic time intervals (entity_stis) and a list of TIRPs to detect (tirpsList). The algorithm detects all of the instances for each of the TIRPs (line 1). For each TIRP, it starts by going over all of the symbolic time intervals (line 2). If the first symbol of the current TIRP (tirp.s[0]) was detected, a while loop starts to look for the TIRP’s next symbols and verifies that their temporal relations with the previous detected symbolic time intervals are the same as in the TIRP temporal relations definition (tirp.r). If the time duration between the i-interval and the j-interval is larger than max_gap, the search for instances is stopped.
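Since the listing of Algorithm 7 is not reproduced here, the following is a hedged sketch of a sequential detection procedure in the spirit of that description, not a transcription of the authors' algorithm. The names entity_stis, tirps_list, tirp.s, tirp.r, and max_gap follow the text; the STI and TIRP classes, the relation helper, the use of Allen's seven relations with epsilon equal to zero, the storage of tirp.r as a dictionary keyed by position pairs, and the rule that every "before" gap must be at most max_gap are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True, order=True)
class STI:  # symbolic time interval; sorts by (start, end, symbol)
    start: float
    end: float
    symbol: str


@dataclass
class TIRP:
    s: List[str]                   # symbols, in order
    r: Dict[Tuple[int, int], str]  # relation for each pair (i, j), i < j


def relation(a: STI, b: STI, max_gap: float) -> Optional[str]:
    """Relation of a to b, where a sorts before b; None if the gap is too large."""
    if a.start == b.start:
        return "equal" if a.end == b.end else "starts"
    if a.end < b.start:
        return "before" if b.start - a.end <= max_gap else None
    if a.end == b.start:
        return "meets"
    if a.end < b.end:
        return "overlaps"
    return "finished-by" if a.end == b.end else "contains"


def detect(entity_stis: List[STI], tirps_list: List[TIRP],
           max_gap: float) -> Dict[int, List[Tuple[STI, ...]]]:
    """For each TIRP, scan the entity's intervals and collect all instances."""
    stis = sorted(entity_stis)
    found: Dict[int, List[Tuple[STI, ...]]] = {}
    for t_idx, tirp in enumerate(tirps_list):
        instances: List[Tuple[STI, ...]] = []

        def extend(matched: List[STI], last_idx: int) -> None:
            pos = len(matched)
            if pos == len(tirp.s):
                instances.append(tuple(matched))
                return
            earliest_end = min(m.end for m in matched)
            for j in range(last_idx + 1, len(stis)):
                cand = stis[j]
                # Intervals are sorted by start time, so once the gap exceeds
                # max_gap no later interval can match either: stop searching.
                if cand.start - earliest_end > max_gap:
                    break
                if cand.symbol != tirp.s[pos]:
                    continue
                # The candidate must relate to every previously matched
                # interval exactly as the TIRP definition requires.
                if all(relation(m, cand, max_gap) == tirp.r[(i, pos)]
                       for i, m in enumerate(matched)):
                    extend(matched + [cand], j)

        for i, sti in enumerate(stis):
            if sti.symbol == tirp.s[0]:  # first symbol of the TIRP detected
                extend([sti], i)
        found[t_idx] = instances
    return found
```

Note that each TIRP triggers its own pass over the entity's intervals, which is why the cost of this approach grows with the number of TIRPs to detect.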

In the full set of experiments, we ran both the SKL algorithm and the STD algorithm on all three datasets used in the current paper, varying the number of TIRPs to detect and the number of entities in which to detect them. These settings represent the typical conditions of an evaluation run and also give a good estimate of the runtime for a single entity (obtained by dividing the runtime by the number of entities).

Here, we present several typical runtime results for the ICU dataset, which has a mean of 292 symbolic time intervals per entity. We show the runtime results for 40, 60, 80, 100, and 120 TIRPs to detect, and for 50, 100, or 200 entities in which to detect these TIRPs. The results demonstrate the increasing difference in runtime between the two algorithms: the Sequential TIRPs Detection algorithm requires longer and longer computation times, whereas the runtime of the SKL algorithm grows only slightly as the number of entities increases significantly.

Figures 23 and 24 show the runtime comparison of the SingleKarmaLego (SKL) algorithm versus the Sequential TIRPs Detection (STD) algorithm on the ICU dataset, with max_gap values of 5 and 10 time units in the domain (seconds in the case of the ICU domain). For each max_gap value, the algorithms were run on 50, 100, or 200 entities, with 40, 60, 80, 100, or 120 TIRPs to detect. Each graph shows the runtime in seconds on the vertical axis; the horizontal axis shows the number of TIRPs at the top and the number of entities at the bottom. As can be seen in these figures, the SKL algorithm required less computation time for all of the combinations we experimented with, and its advantage is clear for any number of entities and TIRPs to detect.
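The experimental grid described in this appendix can be sketched as a small timing harness. This is only an illustration of the setup: the parameter values are taken from the text, while the function names experiment_grid and time_detector and the detector callback signature are hypothetical.

```python
import itertools
import time
from typing import Callable, List, Sequence, Tuple

# Settings reported in the text for the ICU dataset.
MAX_GAPS = [5, 10]                   # time units (seconds in the ICU domain)
ENTITY_COUNTS = [50, 100, 200]
TIRP_COUNTS = [40, 60, 80, 100, 120]


def experiment_grid() -> List[Tuple[int, int, int]]:
    """All (max_gap, n_entities, n_tirps) combinations that were compared."""
    return list(itertools.product(MAX_GAPS, ENTITY_COUNTS, TIRP_COUNTS))


def time_detector(detector: Callable, entities: Sequence, tirps: Sequence,
                  max_gap: float) -> Tuple[float, float]:
    """Run detector on every entity; return (total seconds, seconds per entity).

    The per-entity figure is the total runtime divided by the number of
    entities, as described in the text.
    """
    t0 = time.perf_counter()
    for entity_stis in entities:
        detector(entity_stis, tirps, max_gap)
    total = time.perf_counter() - t0
    return total, total / len(entities)
```

Each of the two algorithms would be passed as detector for every grid point, with the entities and TIRPs subsampled to the requested counts.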

About this article

Cite this article

Moskovitch, R., Shahar, Y. Classification of multivariate time series via temporal abstraction and time intervals mining. Knowl Inf Syst 45, 35–74 (2015). https://doi.org/10.1007/s10115-014-0784-5

Received: 12 June 2013
Revised: 01 May 2014
Accepted: 04 September 2014
Published: 01 October 2014
Issue Date: October 2015
DOI: https://doi.org/10.1007/s10115-014-0784-5
