Abstract
Sensor networks have increased the amount and variety of temporal data available, requiring the definition of new techniques for data mining. Related research typically addresses the problems of indexing, clustering, classification, summarization, and anomaly detection. There is a wide range of techniques to describe and compare time series, but they focus on the series' values. This paper concentrates on a new aspect: describing oscillation patterns. It presents a technique for time series similarity search at multiple temporal scales, defining a descriptor that uses the angular coefficients from a linear segmentation of the curve that represents the evolution of the analyzed series. This technique is generalized to handle co-evolution, in which several phenomena vary at the same time. Preliminary experiments with real datasets showed that our approach correctly characterizes the oscillation of single time series, for multiple time scales, and is able to compute the similarity among sets of co-evolving series.
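The idea of describing a series by the angular coefficients of a linear segmentation can be illustrated with a minimal sketch. It is not the descriptor defined in the paper. It assumes fixed-length segments, takes each slope from the line joining a segment's endpoints, and compares slope vectors with a Euclidean distance. The names `segment_slopes` and `slope_distance` are hypothetical.

```python
import math


def segment_slopes(values, segment_length):
    """Return the slope of the line joining the endpoints of each segment.

    Assumption: segments have a fixed length and samples are evenly spaced;
    the paper's own segmentation procedure is not given in this section.
    """
    if segment_length < 1:
        raise ValueError("segment_length must be at least 1")
    slopes = []
    for start in range(0, len(values) - segment_length, segment_length):
        end = start + segment_length
        slopes.append((values[end] - values[start]) / segment_length)
    return slopes


def slope_distance(series_a, series_b, segment_length):
    """Euclidean distance between the slope vectors of two series.

    Assumption: Euclidean distance is used only for illustration.
    If the vectors differ in length, the extra slopes are ignored.
    """
    slopes_a = segment_slopes(series_a, segment_length)
    slopes_b = segment_slopes(series_b, segment_length)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(slopes_a, slopes_b)))


rising = [0, 1, 2, 3, 4]
rising_shifted = [10, 11, 12, 13, 14]  # same oscillation, different values
falling = [4, 3, 2, 1, 0]

same_shape = slope_distance(rising, rising_shifted, 2)
opposite_shape = slope_distance(rising, falling, 2)
```

In this sketch the two rising series have distance zero even though their values differ by a constant offset, while the rising and falling series are far apart. Calling the functions with several values of `segment_length` gives a rough analogue of comparing series at multiple temporal scales.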
Notes
We use vs to stress that the vector uses symbolic representation.
Acknowledgements
This work was partially funded by CPqD Foundation, CAPES, FAPESP, CNPq grants and CNPq projects WebMAPS and RPG. It is also being partially funded by the Microsoft Research-FAPESP Virtual Institute, under the eFarms project. We thank Jeferson Lobato Fernandes for providing us with experimental data.
Cite this article
Mariote, L.E., Medeiros, C.B., Torres, R.d.S. et al. TIDES—a new descriptor for time series oscillation behavior. Geoinformatica 15, 75–109 (2011). https://doi.org/10.1007/s10707-010-0112-5
DOI: https://doi.org/10.1007/s10707-010-0112-5