Elsevier

Applied Soft Computing

Volume 96, November 2020, 106631
Exact Dynamic Time Warping calculation for weak sparse time series

https://doi.org/10.1016/j.asoc.2020.106631

Highlights

  • This paper proposes an accurate and fast DTW calculation algorithm for weak sparse time series.

  • A mathematical proof is given to establish the accuracy of this algorithm.

  • Several examples from different practical settings show the effectiveness of the proposed accurate and fast DTW calculation algorithm.

Abstract

The Dynamic Time Warping (DTW) technique is widely used in time series data mining, but its computational complexity is very high. In this paper, we propose an accurate and fast DTW calculation algorithm for weak sparse time series (WSTS). The algorithm takes advantage of the weak sparse property and saves considerable time in DTW calculation. We emphasize that the algorithm computes the DTW distance accurately rather than approximately, which is one of the main contributions of this paper. A mathematical proof of this accuracy is also given. Several examples from different practical settings show the effectiveness of the proposed algorithm.

Introduction

As data analysis requirements grow, time series data has become extremely important, especially in the modern big data world. Time series data can be found in a wide range of practical domains, such as human activities, physiological signals, financial records, and other natural phenomena [1], [2], [3]. Many studies on time series can be found in the literature, and a large number of them use the Dynamic Time Warping (DTW) technique.

DTW is a distance measure that differs from the Euclidean distance. It uses a warping technique that captures the similarity between time series, see Fig. 1. DTW has been applied in many research fields and shows strong prospects for industrial application. In [4], voice recognition, a popular research topic in artificial intelligence, was investigated using DTW. In [5], a fingerprint verification system, of the kind widely used in security systems, was proposed based on DTW. In [6], word images scanned from historical documents were matched using DTW. New applications have also been studied recently. In [7], DTW was used to construct an approach for online generator coherency identification in a controlled islanding study. In [8], time-weighted DTW was used to map croplands from high-resolution remote sensing data. In [9], the authors used DTW to detect deviations in Ground Penetrating Radar signals.

Despite these many applications, as pointed out by Mueen in [10], the DTW algorithm is inefficient for distance calculation on sparse time series because it does not take advantage of the sparsity. The same inefficiency arises when DTW is applied to weak sparse time series (WSTS). In this paper, WSTS denotes a time series that contains many consecutive repeated values. WSTS can be found in various domains, for example: the coordinates of an airplane (an airplane may stay at an airport for a long time for boarding or refueling); the recorded temperature of a working refrigerator; a household's TV viewing records; and the daily activity status of a person (resting, walking or running). In this paper, we study an accurate and fast DTW calculation algorithm for WSTS.

In recent years, many studies on fast DTW calculation have appeared in the literature. The Sakoe–Chiba Band [11] and the Itakura Parallelogram [12] are the two most commonly used constraints for speeding up DTW calculation. The resulting warp path is restricted by the constraint and may not be the globally optimal warp path, so the calculation is approximate rather than accurate. These two constraints work poorly when the path strays far from a linear warp. In [13], [14], [15], [16], the authors sped up DTW calculation by using data abstraction. These modified methods are approximate: they reduce the time series in size or map a lower-resolution path to full resolution, and the calculation becomes increasingly inaccurate as the level of abstraction increases. In [17] and [18], the authors proposed new techniques for fast indexing of time series data using DTW. These techniques accelerate applications of DTW, but they do not accelerate the DTW calculation itself. In [19], a multi-level approach that recursively projects a solution from a coarser resolution was used, giving approximate approaches based on reduced time series data. In [20], the authors described several popular approaches for accelerating DTW and made them more efficient through approximations; however, accuracy is lost in some scenarios. Although there are many studies on accelerating DTW calculation, they do not obtain the same DTW distance as the original algorithm. Moreover, they do not focus on DTW calculation for weak sparse time series. In this paper, we investigate DTW calculation on weak sparse time series, taking advantage of the weak sparse property. In addition, the accurate DTW distance is obtained and a mathematical proof is given.
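To make the trade-off of a band constraint concrete, the following is a minimal sketch of the classic dynamic-programming DTW with an optional Sakoe–Chiba-style window. It is not the algorithm proposed in this paper. The function name `dtw_distance`, the `window` parameter, the absolute-difference local cost, and the two example series are all assumptions chosen for illustration.

```python
import math


def dtw_distance(x, y, window=None):
    """Classic O(n*m) DTW distance with an optional band constraint.

    Assumptions (not taken from the paper): the local cost is |a - b| and
    the path cost is the sum of local costs. window=None means no constraint.
    """
    n, m = len(x), len(y)
    if window is not None:
        # The band must be at least |n - m| wide, or the end cell is unreachable.
        window = max(window, abs(n - m))
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        lo = 1 if window is None else max(1, i - window)
        hi = m if window is None else min(m, i + window)
        for j in range(lo, hi + 1):
            d = abs(x[i - 1] - y[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical series: the same bump, shifted far from the diagonal.
x = [0, 0, 0, 0, 1, 2, 1, 0]
y = [0, 1, 2, 1, 0, 0, 0, 0]
exact = dtw_distance(x, y)             # unconstrained warp path
banded = dtw_distance(x, y, window=1)  # path forced to stay near the diagonal
print(exact, banded)
```

In this example the unconstrained distance is zero because the bump can be aligned with itself, while the narrow band cannot reach that alignment and returns a strictly larger value. A banded result is never smaller than the unconstrained one, since it minimizes over a subset of the warp paths.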

The rest of this paper is organized as follows. In Section 2, the problem statement is given. In Section 3, the definitions of time series, weak sparse time series and DTW are presented. In Section 4, the accurate and fast algorithm for calculating the DTW distance on weak sparse time series is developed. In Section 5, validation studies are conducted to show the effectiveness of the proposed algorithm. Section 6 concludes the paper and outlines some future directions.

Section snippets

Problem statement

DTW is a distance measure that captures the similarity between time series by means of a warping path. It has been widely used in various fields, such as biometric data, chemical engineering, medicine, industry, and finance. However, DTW has a high computational complexity and is especially inefficient for weak sparse time series, where it performs redundant calculations on the repeated data. In this paper, we derive an accurate and fast DTW algorithm for weak

Time series

A time series $X=\{x_1, x_2, \ldots, x_n\}$ is defined as a list of observations in temporal order, which are made at equal intervals.

Weak sparse time series (WSTS)

In this paper, we define the weak sparse time series as a time series which contains some consecutive repeated values:

$$X=\{x_1,\ldots,\underbrace{x_i,x_i,\ldots,x_i}_{A_i\ \text{of}\ x_i},\ldots,\underbrace{x_j,x_j,\ldots,x_j}_{A_j\ \text{of}\ x_j},\ldots,x_N\}. \tag{1}$$

For the WSTS $X$ given in (1), there exists at least one value, such as $x_i$, which repeats $A_i$ times. $A_0$ is set to 0 for later use.
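One way to expose the structure in this definition is to collapse each run of repeated values into a (value, count) pair, where the count plays the role of the repeat count A_i. The sketch below is only an illustration of the definition and makes no claim about the representation used inside the proposed algorithm. The function names and the activity-status example are hypothetical.

```python
from itertools import groupby


def run_length_encode(series):
    """Collapse consecutive repeated values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(series)]


def run_length_decode(pairs):
    """Expand (value, run_length) pairs back into the full series."""
    return [value for value, count in pairs for _ in range(count)]


# Hypothetical activity-status series: 0 = rest, 1 = walking, 2 = running.
status = [0, 0, 0, 0, 1, 1, 2, 2, 2, 0, 0, 0]
encoded = run_length_encode(status)
print(encoded)  # [(0, 4), (1, 2), (2, 3), (0, 3)]
```

Here 12 observations reduce to 4 runs, and `run_length_decode(encoded)` recovers the original series, so no information is lost by the encoding.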

Dynamic time warping (DTW)

Given two time series $X=\{x_1, x_2, \ldots, x_n\}$ and $Y=\{y_1, y_2, \ldots, y_m\}$, the DTW

Approach

In this section, the algorithm for fast DTW distance calculation on weak sparse time series is proposed. In addition, a mathematical proof is given to show the accuracy of the proposed algorithm.

Datasets

In this section, the proposed algorithm is evaluated on six weak sparse time series datasets: Earthquake, Computers, RefrigerationDevices, ScreenType, TwoPatterns and Wafer. These six datasets are taken from the UCR benchmark datasets [1] and have the sparsity property. Fig. 9, Fig. 10, Fig. 11, Fig. 12, Fig. 13, Fig. 14 show sample time series from these six datasets to illustrate the sparsity. The experiments are implemented in Python 3.5 and performed on the

Conclusions and future works

In this paper, an accurate and fast algorithm for DTW distance calculation on weak sparse time series is developed. A mathematical proof is given to ensure the accuracy of the proposed algorithm. The performance of the proposed algorithm has been evaluated on six datasets. The validation study shows that the proposed algorithm saves considerable time in DTW calculation on weak sparse time series and gives the exact DTW distance. In addition, a definition of the weak sparse factor is given for

CRediT authorship contribution statement

Lei Ge: Conceptualization, Data curation, Formal analysis, Writing - original draft. Shun Chen: Conceptualization, Supervision, Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References (25)

  • A. Mueen, N. Chavoshi, N. Abu-El-Rub, H. Hamooni, A. Minnich, AWarp: fast warping distance for sparse time series, in:...
  • H. Sakoe et al., Dynamic programming algorithm optimization for spoken word recognition, IEEE Trans. Acoust. Speech Signal Process. (1978)