Fast Normalization-Transformed Subsequence Matching in Time-Series Databases

Yang-Sae MOON
Jinho KIM

Publication
IEICE TRANSACTIONS on Information and Systems, Vol.E90-D, No.12, pp.2007-2018
Publication Date: 2007/12/01
Online ISSN: 1745-1361
DOI: 10.1093/ietisy/e90-d.12.2007
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Data Mining
Keyword: data mining, time-series databases, subsequence matching, normalization transform




Summary: 
The normalization transform is known to be very useful for finding the overall trend of time-series data, since it enables finding sequences with similar fluctuation patterns. Previous subsequence matching methods that use the normalization transform, however, incur index overhead in both storage space and update maintenance, since they must build multiple indexes to support query sequences of arbitrary length. To solve this problem, we adopt a single-index approach to normalization-transformed subsequence matching that supports query sequences of arbitrary length. For the single-index approach, we first introduce the notion of the inclusion-normalization transform by generalizing the original definition of the normalization transform. To normalize a window, the inclusion-normalization transform uses the mean and the standard deviation of a subsequence that includes the window, whereas the original transform uses those of the window itself. Next, we formally prove the correctness of the proposed normalization-transformed subsequence matching method that uses the inclusion-normalization transform. We then propose subsequence matching and index-building algorithms to implement the proposed method. Experimental results for real stock data show that our method improves performance by up to 2.5 to 2.8 times compared with the previous method.
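
The difference between the two transforms described in the summary can be illustrated with a minimal Python sketch. It is not the authors' algorithm or index structure; the function names (`mean_std`, `normalize`, `inclusion_normalize`), the use of the population standard deviation, and the sample data are all assumptions made for illustration, and the sketch assumes the sequences are not constant (nonzero standard deviation).

```python
import math

def mean_std(seq):
    """Return the mean and population standard deviation of seq (assumed convention)."""
    n = len(seq)
    mu = sum(seq) / n
    var = sum((x - mu) ** 2 for x in seq) / n
    return mu, math.sqrt(var)

def normalize(window):
    """Original transform: normalize a window with its own mean and standard deviation."""
    mu, sigma = mean_std(window)
    return [(x - mu) / sigma for x in window]

def inclusion_normalize(subsequence, start, length):
    """Inclusion-style transform (hypothetical helper): normalize the window
    subsequence[start:start+length] with the mean and standard deviation of
    the whole subsequence that includes it."""
    mu, sigma = mean_std(subsequence)
    window = subsequence[start:start + length]
    return [(x - mu) / sigma for x in window]

# Illustrative data only, not taken from the paper.
subsequence = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
window_own = normalize(subsequence[0:3])               # statistics of the window itself
window_incl = inclusion_normalize(subsequence, 0, 3)   # statistics of the including subsequence
```

When the window covers the whole subsequence, the two functions return the same values, which is consistent with the summary's description of the inclusion-normalization transform as a generalization of the original one.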

