Abstract
Similarity of objects is one of the crucial concepts in several applications, including data mining. For complex objects, similarity is nontrivial to define. In this paper we present an intuitive model for measuring the similarity between two time series. The model takes into account outliers, different scaling functions, and variable sampling rates. Using methods from computational geometry, we show that this notion of similarity can be computed in polynomial time. Using statistical approximation techniques, the algorithms can be speeded up considerably. We give preliminary experimental results that show the naturalness of the notion.
Research done while the author was visiting the Univ. of Helsinki.
Keywords
- Line Segment
- Computational Geometry
- Data Mining Application
- Preliminary Experimental Result
- Longest Common Subsequence
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Das, G., Gunopulos, D., Mannila, H. (1997). Finding similar time series. In: Komorowski, J., Zytkow, J. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 1997. Lecture Notes in Computer Science, vol 1263. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63223-9_109
DOI: https://doi.org/10.1007/3-540-63223-9_109
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63223-8
Online ISBN: 978-3-540-69236-2
eBook Packages: Springer Book Archive