Discovery of time series \(k\)-motifs based on multidimensional index

Regular Paper, Knowledge and Information Systems

Abstract

Time series motifs are frequently occurring but previously unknown subsequences of a longer time series. Discovering time series motifs is a crucial task in time series data mining. In a time series motif discovery algorithm, finding the nearest neighbors of a subsequence is the basic operation. To make this operation efficient, we can use an advanced multidimensional index structure for time series data. In this paper, we propose two novel algorithms for discovering motifs in time series data: the first is based on the \(\hbox{R}^{*}\)-tree and an early abandoning technique, and the second uses a dimensionality reduction method and the state-of-the-art Skyline index. We demonstrate the effectiveness of the proposed algorithms through experiments on real datasets from different areas. The experimental results show that both proposed algorithms outperform the most popular method, random projection, in time efficiency while achieving the same accuracy.
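To make the basic operation concrete, the following is a minimal sketch of brute-force motif search with early abandoning of the Euclidean distance computation. It is an illustration under stated assumptions, not the paper's algorithms: it uses no \(\hbox{R}^{*}\)-tree, Skyline index, or dimensionality reduction, and the names `early_abandon_dist` and `top_motif`, the subsequence length `m`, and the matching radius `r` are hypothetical choices made for this example. The motif is taken here to be the subsequence with the most non-overlapping matches within distance `r`.

```python
import math


def early_abandon_dist(a, b, threshold):
    """Euclidean distance between a and b, or math.inf as soon as
    the running sum of squares shows the distance exceeds threshold."""
    limit = threshold * threshold
    total = 0.0
    for x, y in zip(a, b):
        total += (x - y) ** 2
        if total > limit:
            return math.inf  # abandon: this candidate cannot be a match
    return math.sqrt(total)


def top_motif(series, m, r):
    """Return (start, match_starts) for the length-m subsequence with
    the most non-overlapping matches within radius r (brute force)."""
    n = len(series) - m + 1
    subs = [series[i:i + m] for i in range(n)]
    best_start, best_matches = -1, []
    for i in range(n):
        matches = []
        for j in range(n):
            if abs(i - j) < m:
                continue  # skip trivial matches that overlap the query
            if matches and j - matches[-1] < m:
                continue  # skip candidates overlapping the previous match
            if early_abandon_dist(subs[i], subs[j], r) <= r:
                matches.append(j)
        if len(matches) > len(best_matches):
            best_start, best_matches = i, matches
    return best_start, best_matches


series = [0, 1, 2, 1, 0, 5, 7, 1, 2, 1, 9, 4, 1, 2, 1, 6]
print(top_motif(series, 3, 0.1))  # (1, [7, 12])
```

The pattern `[1, 2, 1]` occurs at positions 1, 7, and 12, so the subsequence starting at position 1 is reported with two matches. This sketch compares every pair of subsequences, which is quadratic in the series length; an index structure of the kind the abstract describes is meant to prune most of those comparisons.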

Author information

Corresponding author

Correspondence to Duong Tuan Anh.

Cite this article

Son, N.T., Anh, D.T. Discovery of time series \(k\)-motifs based on multidimensional index. Knowl Inf Syst 46, 59–86 (2016). https://doi.org/10.1007/s10115-014-0814-3

  • DOI: https://doi.org/10.1007/s10115-014-0814-3
