Approximate Clustering of Time Series Using Compact Model-Based Descriptions

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2008)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4947)

Abstract

Clustering time series is usually limited by the fact that the length of the time series has a significantly negative influence on the runtime. On the other hand, approximate clustering applied to existing compressed representations of time series (e.g., obtained through dimensionality reduction) usually suffers from low accuracy. We propose a method for the compression of time series based on mathematical models that explore dependencies between different time series. In particular, each time series is represented by a combination of a set of specific reference time series. The cost of this representation depends only on the number of reference time series rather than on the length of the time series. We show that using only a small number of reference time series yields a rather accurate representation while reducing the storage cost and runtime of clustering algorithms significantly. Our experiments illustrate that these representations can be used to produce an approximate clustering with high accuracy and considerably reduced runtime.
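
The idea of describing a time series by a combination of reference time series can be illustrated with a minimal sketch. The abstract does not specify the model, so the sketch assumes a plain linear combination fitted by least squares; the function names (`fit_coefficients`, `reconstruct`, `euclidean`) and the sine/cosine reference series are hypothetical choices made for illustration only, not the authors' method.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build an augmented matrix so the inputs are left unmodified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_coefficients(series, references):
    """Least-squares coefficients of `series` w.r.t. the reference series
    (assumed linear model, solved through the normal equations)."""
    k = len(references)
    gram = [[sum(p * q for p, q in zip(references[i], references[j]))
             for j in range(k)] for i in range(k)]
    rhs = [sum(p * q for p, q in zip(references[i], series))
           for i in range(k)]
    return solve_linear_system(gram, rhs)


def reconstruct(coeffs, references):
    """Rebuild an approximate series from its compact description."""
    length = len(references[0])
    return [sum(c * ref[t] for c, ref in zip(coeffs, references))
            for t in range(length)]


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical data: two reference series of length 100.
n_points = 100
ts = [2 * math.pi * i / n_points for i in range(n_points)]
refs = [[math.sin(t) for t in ts], [math.cos(t) for t in ts]]

# A series that happens to be an exact combination of the references.
series = [2.0 * math.sin(t) + 0.5 * math.cos(t) for t in ts]

coeffs = fit_coefficients(series, refs)   # 2 numbers instead of 100
error = euclidean(series, reconstruct(coeffs, refs))
```

Under this assumption, each series is stored as one coefficient per reference series, so the storage cost and the cost of comparing two descriptions depend on the number of references rather than on the series length. Note that the Euclidean distance between two coefficient vectors equals the distance between the reconstructed series only when the reference series are orthonormal; otherwise it is an approximation.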

Editor information

Jayant R. Haritsa, Ramamohanarao Kotagiri, Vikram Pudi

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kriegel, HP., Kröger, P., Pryakhin, A., Renz, M., Zherdin, A. (2008). Approximate Clustering of Time Series Using Compact Model-Based Descriptions. In: Haritsa, J.R., Kotagiri, R., Pudi, V. (eds) Database Systems for Advanced Applications. DASFAA 2008. Lecture Notes in Computer Science, vol 4947. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78568-2_27

  • DOI: https://doi.org/10.1007/978-3-540-78568-2_27

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-78567-5

  • Online ISBN: 978-3-540-78568-2

  • eBook Packages: Computer Science, Computer Science (R0)
