NSPRING: the SPRING extension for subsequence matching of time series supporting normalization

The Journal of Supercomputing

Abstract

Mining sequences and patterns in time series data streams is fast becoming common practice. Rapid progress in data collection and web technologies is producing tremendous growth in streaming data, in various complex forms, that must be analyzed in real time. Traditional data mining methods, which typically require the data to be scanned repeatedly, are not feasible for stream data applications. Newer techniques such as SPRING address these challenges by identifying sequences of patterns in time series streams with complexity that is linear in both time and space. Unfortunately, SPRING does not support data normalization, which makes it inapplicable to most data sets. In this paper, we propose NSPRING, an approach based on SPRING that retains its advantages, e.g., low time and space complexity, while supporting normalization. Furthermore, NSPRING retains mining accuracy similar to that of SPRING.
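
To make the idea of linear-cost stream matching concrete, the following is a minimal sketch of a SPRING-style best-match search under the dynamic time warping (DTW) distance. It is not code from this paper and it does not implement NSPRING: it follows the commonly described SPRING recurrence, in which a zero-cost "star" cell lets a match begin at any stream position. The function name `spring_best_match`, the squared-difference point cost, and the tie-breaking rule are assumptions made for illustration.

```python
import math


def spring_best_match(stream, query):
    """Return (distance, start, end) of the stream subsequence with the
    smallest DTW distance to `query` (0-based, inclusive indices).

    Illustrative sketch only: the squared point cost is an assumption.
    """
    m = len(query)
    # d[i]: cheapest warping cost of aligning query[:i] with a subsequence
    # that ends at the previous stream point; s[i]: where it starts.
    d = [0.0] + [math.inf] * m
    s = [0] * (m + 1)
    best = (math.inf, -1, -1)

    for t, x in enumerate(stream):
        # Cell 0 is the "star" cell: cost 0, and any path leaving it
        # (vertically or diagonally) starts at the current point t.
        s[0] = t
        new_d = [0.0] * (m + 1)
        new_s = [t] * (m + 1)
        for i in range(1, m + 1):
            cost = (x - query[i - 1]) ** 2
            # Cheapest predecessor; equal costs fall back to the earlier start.
            prev_cost, prev_start = min(
                (new_d[i - 1], new_s[i - 1]),  # same stream point, previous query point
                (d[i], s[i]),                  # previous stream point, same query point
                (d[i - 1], s[i - 1]),          # diagonal step
            )
            new_d[i] = cost + prev_cost
            new_s[i] = prev_start
        d, s = new_d, new_s
        if d[m] < best[0]:
            best = (d[m], s[m], t)
    return best
```

Each arriving point costs O(m) time for a query of length m, and only two arrays of length m + 1 are kept, which is where the linear time and space behaviour comes from. The sketch compares raw values, so a stream containing the same shape at a different offset or scale (for example `[11, 12, 13]` against the query `[1, 2, 3]`) yields a large distance; this is the limitation that normalization is meant to address.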

Acknowledgments

The authors are thankful for the financial support from the research grant “Temporal Data Stream Mining by Using Incrementally Optimized Very Fast Decision Forest (iOVFDF),” Grant no. MYRG2015-00128-FST, offered by the University of Macau, FST, and RDAO. We would also like to acknowledge the kind assistance of Ms. Katlin Kreamer-Tonin for proofreading this paper.

Author information

Corresponding author

Correspondence to Simon Fong.

Appendix

See Table 5.

Table 5 List of revised UCR data sets

Cite this article

Gong, X., Fong, S., Chan, J.H. et al. NSPRING: the SPRING extension for subsequence matching of time series supporting normalization. J Supercomput 72, 3801–3825 (2016). https://doi.org/10.1007/s11227-015-1525-6

  • DOI: https://doi.org/10.1007/s11227-015-1525-6
