Clustering multidimensional sequences in spatial and temporal databases

Assent, Ira; Krieger, Ralph; Glavic, Boris; Seidl, Thomas

doi:10.1007/s10115-007-0121-3

Regular Paper
Published: 16 January 2008

Volume 16, pages 29–51 (2008)
Knowledge and Information Systems

Ira Assent¹, Ralph Krieger¹, Boris Glavic¹ & Thomas Seidl¹

Abstract

Many environmental, scientific, technical or medical database applications require effective and efficient mining of time series, sequences or trajectories of measurements taken at different time points and positions, forming large temporal or spatial databases. In particular, the analysis of concurrent and multidimensional sequences poses new challenges in finding clusters of arbitrary length and varying number of attributes. We present a novel algorithm capable of finding parallel clusters in different subspaces and demonstrate our results for temporal and spatial applications. Our analysis of structural quality parameters in rivers is successfully used by hydrologists to develop measures for river quality improvements.

Author information

Authors and Affiliations

Data Management and Exploration Group, RWTH Aachen University, Aachen, Germany
Ira Assent, Ralph Krieger, Boris Glavic & Thomas Seidl

Corresponding author

Correspondence to Ira Assent.

About this article

Cite this article

Assent, I., Krieger, R., Glavic, B. et al. Clustering multidimensional sequences in spatial and temporal databases. Knowl Inf Syst 16, 29–51 (2008). https://doi.org/10.1007/s10115-007-0121-3

Received: 24 May 2007
Revised: 21 October 2007
Accepted: 15 November 2007
Published: 16 January 2008
Issue Date: July 2008
DOI: https://doi.org/10.1007/s10115-007-0121-3
