Analysis of Time Series Data with Predictive Clustering Trees

Džeroski, Sašo; Gjorgjioski, Valentin; Slavkov, Ivica; Struyf, Jan

doi:10.1007/978-3-540-75549-4_5

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4747)

Included in the following conference series:

International Workshop on Knowledge Discovery in Inductive Databases

Abstract

Predictive clustering is a general framework that unifies clustering and prediction. This paper investigates how to apply this framework to cluster time series data. The resulting system, Clus-TS, constructs predictive clustering trees (PCTs) that partition a given set of time series into homogeneous clusters. In addition, PCTs provide a symbolic description of the clusters. We evaluate Clus-TS on time series data from microarray experiments. Each data set records the change over time in the expression level of yeast genes as a response to a change in environmental conditions. Our evaluation shows that Clus-TS is able to cluster genes with similar responses, and to predict the time series based on the description of a gene. Clus-TS is part of a larger project where the goal is to investigate how global models can be combined with inductive databases.
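To make the idea of partitioning time series into homogeneous clusters with a symbolic description more concrete, the following is a minimal sketch of how a single split in a clustering tree might be chosen. It is not the Clus-TS implementation. The Euclidean distance, the pairwise-distance definition of variance, the Boolean gene descriptors (`stress_response`, `ribosomal`), and the toy series are all assumptions made for illustration; the abstract does not specify any of them.

```python
from itertools import combinations
from math import sqrt


def euclidean(a, b):
    """Assumed distance between two equal-length time series."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def variance(series, dist=euclidean):
    """Cluster variance defined through pairwise distances only.

    Sum of squared distances over unordered pairs, divided by n^2.
    Returns 0.0 for clusters with fewer than two series.
    """
    n = len(series)
    if n < 2:
        return 0.0
    total = sum(dist(a, b) ** 2 for a, b in combinations(series, 2))
    return total / (n * n)


def best_split(examples, dist=euclidean):
    """Pick the Boolean descriptor whose split most reduces variance.

    examples: list of (descriptors, series) pairs, where descriptors is a
    dict mapping a (hypothetical) feature name to True/False.
    Returns (feature_name, variance_reduction), or (None, 0.0) if no
    split reduces the variance.
    """
    n = len(examples)
    parent = variance([s for _, s in examples], dist)
    best = (None, 0.0)
    for feat in sorted(examples[0][0]):
        yes = [s for f, s in examples if f[feat]]
        no = [s for f, s in examples if not f[feat]]
        if not yes or not no:
            continue  # the test does not separate the examples
        weighted = (len(yes) * variance(yes, dist)
                    + len(no) * variance(no, dist)) / n
        reduction = parent - weighted
        if reduction > best[1]:
            best = (feat, reduction)
    return best


# Toy data: two genes whose expression rises, two whose expression falls.
examples = [
    ({"stress_response": True, "ribosomal": False}, [0.0, 1.0, 2.0]),
    ({"stress_response": True, "ribosomal": True}, [0.0, 1.1, 2.1]),
    ({"stress_response": False, "ribosomal": True}, [0.0, -1.0, -2.0]),
    ({"stress_response": False, "ribosomal": False}, [0.0, -0.9, -2.1]),
]

feature, reduction = best_split(examples)
print(feature, round(reduction, 3))
```

In this sketch the chosen test (here `stress_response`) acts as the symbolic description of the two resulting clusters. Applying `best_split` recursively to each side would grow a full tree, and any other distance between series could be passed in through the `dist` parameter.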

Author information

Authors and Affiliations

Dept. of Knowledge Technologies, Jožef Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia
Sašo Džeroski, Valentin Gjorgjioski & Ivica Slavkov
Dept. of Computer Science, Katholieke Universiteit Leuven, Celestijnenlaan 200A, 3001 Leuven, Belgium
Jan Struyf

Editor information

Sašo Džeroski, Jan Struyf

Cite this paper

Džeroski, S., Gjorgjioski, V., Slavkov, I., Struyf, J. (2007). Analysis of Time Series Data with Predictive Clustering Trees. In: Džeroski, S., Struyf, J. (eds) Knowledge Discovery in Inductive Databases. KDID 2006. Lecture Notes in Computer Science, vol 4747. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75549-4_5

DOI: https://doi.org/10.1007/978-3-540-75549-4_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-75548-7
Online ISBN: 978-3-540-75549-4
eBook Packages: Computer Science, Computer Science (R0)
