Clustering multidimensional sequences in spatial and temporal databases

  • Regular Paper
Knowledge and Information Systems

Abstract

Many environmental, scientific, technical or medical database applications require effective and efficient mining of time series, sequences or trajectories of measurements taken at different time points and positions, which form large temporal or spatial databases. In particular, the analysis of concurrent and multidimensional sequences poses new challenges in finding clusters of arbitrary length and with varying numbers of attributes. We present a novel algorithm capable of finding parallel clusters in different subspaces and demonstrate our results for temporal and spatial applications. Our analysis of structural quality parameters in rivers is successfully used by hydrologists to develop measures for river quality improvement.

Author information

Corresponding author

Correspondence to Ira Assent.

About this article

Cite this article

Assent, I., Krieger, R., Glavic, B. et al. Clustering multidimensional sequences in spatial and temporal databases. Knowl Inf Syst 16, 29–51 (2008). https://doi.org/10.1007/s10115-007-0121-3

  • DOI: https://doi.org/10.1007/s10115-007-0121-3
