ABSTRACT
The efficiency of searching for scaling-invariant and shifting-invariant shapes in a massive set of time series data can be improved if the search is performed on an approximated sequence that involves less data but retains all the significant features. However, commonly used smoothing techniques, such as moving averages and best-fitting polylines, usually miss important peaks and troughs and deform the time series. In addition, these techniques are not robust, as they often require users to supply a set of smoothing parameters that directly affects the resulting approximation pattern. To address these problems, this paper proposes an algorithm that constructs a lattice structure as an underlying framework for pattern matching. As inputs, the algorithm takes a time series and the user's required level of detail. It then identifies all the important peaks and troughs (known as control points) in the time series and classifies them into appropriate layers of the lattice structure. The control points in each layer form an approximation pattern yet preserve the overall shape of the original series, with the approximation error lying within a certain bound. The lower the layer, the more precise the approximation pattern. Put another way, the algorithm takes different levels of data smoothing into account. The lattice structure can also be indexed to further improve the performance of pattern matching.
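The idea of layered control points can be illustrated with a minimal Python sketch. This is not the paper's algorithm, which the abstract does not spell out. It assumes that control points are the endpoints plus every turning point, that a layer is built by greedily adding the control point that deviates most from the current piecewise-linear approximation, and that each layer corresponds to one user-supplied error tolerance. The names `find_control_points`, `approximate`, and `build_layers` are hypothetical.

```python
from typing import Dict, List, Sequence


def find_control_points(series: Sequence[float]) -> List[int]:
    """Return indices of both endpoints and of every turning point."""
    n = len(series)
    if n < 3:
        return list(range(n))
    points = [0]
    prev_sign = 0
    for i in range(1, n):
        diff = series[i] - series[i - 1]
        sign = (diff > 0) - (diff < 0)
        if sign != 0:
            if prev_sign != 0 and sign != prev_sign:
                points.append(i - 1)  # direction flipped at index i - 1
            prev_sign = sign
    points.append(n - 1)
    return points


def approximate(series: Sequence[float], kept: List[int]) -> List[float]:
    """Piecewise-linear interpolation through the kept (sorted) indices."""
    approx = [0.0] * len(series)
    for a, b in zip(kept, kept[1:]):
        for i in range(a, b + 1):
            t = (i - a) / (b - a)
            approx[i] = series[a] + t * (series[b] - series[a])
    return approx


def build_layers(series: Sequence[float],
                 tolerances: Sequence[float]) -> Dict[float, List[int]]:
    """Map each tolerance to the control-point indices of its layer.

    Assumption: layers are nested, and a smaller tolerance gives a
    lower, more detailed layer.
    """
    if len(series) < 2:
        raise ValueError("need at least two samples")
    control = find_control_points(series)
    kept = [control[0], control[-1]]
    candidates = set(control[1:-1])
    layers: Dict[float, List[int]] = {}
    for tol in sorted(tolerances, reverse=True):  # coarsest layer first
        while candidates:
            approx = approximate(series, kept)
            error = max(abs(s - a) for s, a in zip(series, approx))
            if error <= tol:
                break
            # Add the unused control point that deviates the most;
            # ties are broken in favour of the smaller index.
            worst = max(candidates,
                        key=lambda i: (abs(series[i] - approx[i]), -i))
            candidates.remove(worst)
            kept = sorted(kept + [worst])
        layers[tol] = list(kept)
    return layers


series = [0, 1, 2, 3, 2.5, 3.5, 5, 6, 3, 1, 0]
layers = build_layers(series, [2.0, 0.5])
```

Because the insertion order does not depend on the tolerance, every coarser layer is a subset of the finer ones. In this sketch a tolerance may remain unmet once all control points have been used, since only peaks, troughs, and endpoints are eligible.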
Index Terms
- Efficient and robust feature extraction and pattern matching of time series by a lattice structure