Abstract
We present a new method for the understandable description of local temporal relationships in multivariate data, called Time Series Knowledge Mining (TSKM). We define the Time Series Knowledge Representation (TSKR) as a new language for expressing temporal knowledge in time interval data. The patterns have a hierarchical structure, with levels corresponding to the temporal concepts of duration, coincidence, and partial order. The patterns are very compact, but offer details for each element on demand. In comparison with related approaches, the TSKR is shown to have advantages in robustness, expressivity, and comprehensibility. The search for coincidence and partial order in interval data can be formulated as instances of the well-known frequent itemset problem. Efficient algorithms for the discovery of the patterns are adapted accordingly. A novel form of search space pruning effectively reduces the size of the mining result to ease interpretation and speed up the algorithms. Human interaction is used during the mining to analyze and validate partial results as early as possible and to guide further processing steps. The efficacy of the methods is demonstrated using two real-life data sets. In an application to sports medicine, the results were recognized as valid and useful by an expert in the field.
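To illustrate how a search for coincidence in interval data can be phrased as a frequent itemset problem, the following is a minimal sketch under stated assumptions, not the TSKM algorithm itself. It assumes labeled intervals with integer, inclusive start and end points, treats the set of labels active at each time point as one transaction, and counts itemsets by brute-force enumeration. The function names, the toy intervals, and the support threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def coincidence_transactions(intervals, t_start, t_end):
    """For each integer time point, collect the labels whose interval covers it.

    `intervals` is a list of (label, start, end) with inclusive integer bounds
    (an assumption made for this sketch).
    """
    return [
        frozenset(label for label, start, end in intervals if start <= t <= end)
        for t in range(t_start, t_end + 1)
    ]


def frequent_itemsets(transactions, min_support):
    """Count every non-empty subset of every transaction (brute force)
    and keep those occurring in at least `min_support` transactions."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(transaction)
        for size in range(1, len(items) + 1):
            for subset in combinations(items, size):
                counts[subset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Toy data (hypothetical): three overlapping labeled intervals.
intervals = [
    ("A", 0, 5),
    ("B", 2, 7),
    ("C", 4, 9),
]

transactions = coincidence_transactions(intervals, 0, 9)
result = frequent_itemsets(transactions, min_support=3)

for itemset, support in sorted(result.items()):
    print(itemset, support)
```

With these toy intervals, the pairs A with B and B with C each coincide at four time points and are kept, while A with C coincides at only two and is dropped. Brute-force enumeration is exponential in transaction size; dedicated frequent or closed itemset miners would replace it on real data.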
Additional information
Communicated by Johannes Gehrke.
Cite this article
Mörchen, F., Ultsch, A. Efficient mining of understandable patterns from multivariate interval time series. Data Min Knowl Disc 15, 181–215 (2007). https://doi.org/10.1007/s10618-007-0070-1
DOI: https://doi.org/10.1007/s10618-007-0070-1