Abstract
Many environmental, scientific, technical, or medical database applications require effective and efficient mining of time series, sequences, or trajectories of measurements taken at different time points and positions, which form large temporal or spatial databases. In particular, the analysis of concurrent and multidimensional sequences poses new challenges in finding clusters of arbitrary length and with a varying number of attributes. We present a novel algorithm capable of finding parallel clusters in different subspaces and demonstrate our results for temporal and spatial applications. Our analysis of structural quality parameters in rivers has been used successfully by hydrologists to develop measures for improving river quality.
Cite this article
Assent, I., Krieger, R., Glavic, B. et al. Clustering multidimensional sequences in spatial and temporal databases. Knowl Inf Syst 16, 29–51 (2008). https://doi.org/10.1007/s10115-007-0121-3