Exact indexing for massive time series databases under time warping distance

Niennattrakul, Vit; Ruengronghirunya, Pongsakorn; Ratanamahatana, Chotirat Ann

doi:10.1007/s10618-010-0165-y

Published: 16 February 2010

Volume 21, pages 509–541 (2010)

Data Mining and Knowledge Discovery

Abstract

Among the many distance measures for time series data, Dynamic Time Warping (DTW) distance is recognized as one of the most accurate and suitable, owing to its flexibility in sequence alignment. However, computing DTW distance is expensive. In very large time series databases, a sequential scan of the entire database is impractical, and random access through an index structure does not solve the problem either, because the high dimensionality of time series data incurs extremely high I/O cost. More specifically, a sequential structure incurs high CPU cost but low I/O cost, while an index structure incurs low CPU cost but high I/O cost. In this work, we therefore propose a novel indexed sequential structure called TWIST (Time Warping in Indexed Sequential sTructure), which benefits from both sequential access and an index structure. When a query sequence is issued, TWIST calculates lower bounding distances between the query sequence and a group of candidate sequences, and then determines the data access order in advance, thereby greatly reducing the number of both sequential and random accesses. Our indexed sequential structure achieves a significant speedup in query processing. In addition, our method outperforms existing rival methods in query processing time, number of page accesses, and storage requirement, while guaranteeing no false dismissals.
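
The general idea of ordering data access by lower bounds can be illustrated with a minimal sketch. This is not TWIST itself: the abstract does not specify the lower bound, the grouping of candidates, or the on-disk layout, so the code below assumes equal-length sequences held in memory, a Sakoe-Chiba band of half-width r, and an envelope-based lower bound in the style of LB_Keogh as a stand-in. All function names are hypothetical.

```python
import math


def dtw(q, c, r):
    """DTW distance between q and c under a Sakoe-Chiba band of half-width r."""
    n, m = len(q), len(c)
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        cur = [math.inf] * (m + 1)
        # Only cells within r of the diagonal are reachable.
        for j in range(max(1, i - r), min(m, i + r) + 1):
            cost = (q[i - 1] - c[j - 1]) ** 2
            cur[j] = cost + min(prev[j], prev[j - 1], cur[j - 1])
        prev = cur
    return math.sqrt(prev[m])


def envelope(q, r):
    """Upper and lower envelope of q over a sliding window of half-width r."""
    n = len(q)
    upper = [max(q[max(0, i - r):min(n, i + r + 1)]) for i in range(n)]
    lower = [min(q[max(0, i - r):min(n, i + r + 1)]) for i in range(n)]
    return upper, lower


def lb_keogh(upper, lower, c):
    """Envelope-based lower bound on the banded DTW distance (assumed stand-in)."""
    total = 0.0
    for u, l, x in zip(upper, lower, c):
        if x > u:
            total += (x - u) ** 2
        elif x < l:
            total += (l - x) ** 2
    return math.sqrt(total)


def nearest_neighbor(query, database, r):
    """Exact 1-NN search: visit candidates in ascending lower-bound order."""
    upper, lower = envelope(query, r)
    order = sorted((lb_keogh(upper, lower, c), idx) for idx, c in enumerate(database))
    best_dist, best_idx, computed = math.inf, -1, 0
    for lb, idx in order:
        # Every remaining candidate has a lower bound at least this large,
        # so none of them can beat the best distance found so far.
        if lb >= best_dist:
            break
        d = dtw(query, database[idx], r)
        computed += 1
        if d < best_dist:
            best_dist, best_idx = d, idx
    return best_idx, best_dist, computed
```

Because the lower bound never exceeds the true DTW distance, stopping at the first candidate whose bound reaches the best distance found so far cannot discard the true nearest neighbor, which is the sense in which such a search has no false dismissals. The third return value counts how many full DTW computations were needed.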

Author information

Authors and Affiliations

Department of Computer Engineering, Chulalongkorn University, Bangkok, Thailand
Vit Niennattrakul, Pongsakorn Ruengronghirunya & Chotirat Ann Ratanamahatana

Corresponding author

Correspondence to Chotirat Ann Ratanamahatana.

Additional information

Responsible editor: Eamonn Keogh.

About this article

Cite this article

Niennattrakul, V., Ruengronghirunya, P. & Ratanamahatana, C.A. Exact indexing for massive time series databases under time warping distance. Data Min Knowl Disc 21, 509–541 (2010). https://doi.org/10.1007/s10618-010-0165-y

Received: 15 June 2009
Accepted: 21 January 2010
Published: 16 February 2010
Issue Date: November 2010
DOI: https://doi.org/10.1007/s10618-010-0165-y
