Exact multi-length scale and mean invariant motif discovery

Applied Intelligence

Abstract

Discovering approximately recurrent motifs (ARMs) in time series is an active area of research in data mining. Exact motif discovery is the problem of efficiently finding the most similar pairs of time series subsequences, and it can serve as a basis for discovering ARMs. The most efficient algorithm for solving this problem is the MK algorithm, which was designed to find a single pair of time series subsequences with maximum similarity at a known length. This paper provides three extensions of the MK algorithm that allow it to find the top K most similar pairs of subsequences at multiple lengths using both the Euclidean distance metric and a scale-invariant normalized version of it. The proposed algorithms are then applied to both synthetic and real-world data, with a focus on the discovery of ARMs in human motion trajectories.
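To make the problem statement concrete, the following is a minimal brute-force sketch of exact motif discovery at several lengths. It is not the MK algorithm or any of the extensions proposed in the paper; it only illustrates what such algorithms must return. The function names (`znorm`, `euclidean`, `top_k_pairs`), the use of z-normalization as the scale and mean invariant distance, and the rule that overlapping windows are skipped are all assumptions made for illustration.

```python
import math

def znorm(window):
    """Shift to zero mean and scale to unit standard deviation.

    Assumption: z-normalization stands in for the paper's scale and
    mean invariant normalization, which the abstract does not spell out.
    """
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / n)
    if std == 0.0:
        return [0.0] * n  # constant window carries no shape information
    return [(v - mean) / std for v in window]

def euclidean(a, b):
    """Plain Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def top_k_pairs(series, lengths, k, normalize=False):
    """Brute-force baseline: for each length m, return the k closest
    non-overlapping subsequence pairs as (distance, i, j) tuples."""
    result = {}
    for m in lengths:
        windows = [series[i:i + m] for i in range(len(series) - m + 1)]
        if normalize:
            windows = [znorm(w) for w in windows]
        pairs = []
        for i in range(len(windows)):
            # Start at i + m so that overlapping (trivial) matches are skipped.
            for j in range(i + m, len(windows)):
                pairs.append((euclidean(windows[i], windows[j]), i, j))
        pairs.sort()
        result[m] = pairs[:k]
    return result

# Hypothetical toy series: the shape at index 0 recurs at index 6,
# shifted in mean and scaled in amplitude.
series = [0, 1, 0, 3, 7, 2, 10, 14, 10]
raw = top_k_pairs(series, lengths=[3], k=1)
invariant = top_k_pairs(series, lengths=[3], k=1, normalize=True)
print(raw[3][0], invariant[3][0])
```

On this toy series the raw Euclidean distance and the normalized distance select different best pairs, which is why the choice of distance matters. The baseline compares every pair of windows at every length, so its cost grows quadratically with the series length; exact algorithms such as MK aim to return the same answer while pruning most of these comparisons.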



Acknowledgments

This study has been partially supported by JSPS Grant-in-Aid for JSPS Postdoctoral Fellows P12046, JSPS KAKENHI Grant Numbers 24240023 and 15K12098, the Center of Innovation Program from JST, and AFOSR/AOARD Grant No. FA2386-14-1-0005.

Author information

Corresponding author

Correspondence to Yasser Mohammad.


About this article


Cite this article

Mohammad, Y., Nishida, T. Exact multi-length scale and mean invariant motif discovery. Appl Intell 44, 322–339 (2016). https://doi.org/10.1007/s10489-015-0684-8


  • DOI: https://doi.org/10.1007/s10489-015-0684-8
