Abstract
Because time series are a ubiquitous and increasingly prevalent type of data, much recent research effort has been devoted to time series data mining. As with all data mining problems, the key to effective and scalable algorithms is choosing the right representation of the data. Many high-level representations of time series have been proposed for data mining. In this work, we introduce a new technique based on a bit-level approximation of the data. The representation has several important advantages over existing techniques. One unique advantage is that it allows raw data to be compared directly to the reduced representation while still guaranteeing a lower bound on the Euclidean distance. This property can be exploited to produce faster exact algorithms for similarity search. In addition, we demonstrate that our new representation allows time series clustering to scale to much larger datasets.
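The abstract does not spell out how the bit-level representation is built, so the following is only a minimal sketch under stated assumptions: each stored series is shifted to zero mean and reduced to one bit per point (1 if the value is above zero, 0 otherwise), and a raw query is compared against those bits. The function names `center`, `clip`, and `lower_bound` are hypothetical and chosen for illustration; this is not claimed to be the paper's exact formulation.

```python
import math

def center(series):
    """Shift a series to zero mean (assumed preprocessing step)."""
    mean = sum(series) / len(series)
    return [x - mean for x in series]

def clip(series):
    """One bit per point: 1 if the value is above zero, else 0."""
    return [1 if x > 0 else 0 for x in series]

def euclidean(a, b):
    """Plain Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def lower_bound(query, bits):
    """Lower bound on euclidean(query, stored) using only the bits of `stored`.

    A bit of 1 means the stored value is > 0; a bit of 0 means it is <= 0.
    When the raw query value lies on the other side of zero, the two points
    differ by at least |q|, so q*q can safely be added to the squared sum.
    Points on the same side contribute nothing, which keeps the bound valid.
    """
    total = 0.0
    for q, b in zip(query, bits):
        if (b == 1 and q <= 0) or (b == 0 and q > 0):
            total += q * q
    return math.sqrt(total)

stored = center([1.0, 3.0, 2.0, 6.0])
query = center([4.0, 1.0, 5.0, 2.0])
bits = clip(stored)

lb = lower_bound(query, bits)
true_distance = euclidean(query, stored)
```

Because `lb` never exceeds `true_distance`, a search that discards candidates whose lower bound is already above the best distance found so far cannot discard the true nearest neighbour, which is what makes an exact search on the reduced data possible.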
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ratanamahatana, C., Keogh, E., Bagnall, A.J., Lonardi, S. (2005). A Novel Bit Level Time Series Representation with Implication of Similarity Search and Clustering. In: Ho, T.B., Cheung, D., Liu, H. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2005. Lecture Notes in Computer Science, vol 3518. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11430919_90
DOI: https://doi.org/10.1007/11430919_90
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26076-9
Online ISBN: 978-3-540-31935-1
eBook Packages: Computer Science (R0)