Abstract
Discovering approximately recurrent motifs (ARMs) in time series is an active area of research in data mining. Exact motif discovery is the problem of efficiently finding the most similar pairs of time series subsequences, and it can serve as a basis for discovering ARMs. The most efficient algorithm for this problem is the MK algorithm, which was designed to find a single pair of time series subsequences with maximum similarity at a known length. This paper provides three extensions of the MK algorithm that allow it to find the top K similar subsequences at multiple lengths using both the Euclidean distance metric and a scale-invariant normalized version of it. The proposed algorithms are then applied to both synthetic and real-world data, with a focus on the discovery of ARMs in human motion trajectories.
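To make the problem statement concrete, the following is a minimal brute-force sketch of exact pair-motif discovery at a single known length. It is not the MK algorithm or any of the extensions proposed in the paper; it only illustrates the quantity those algorithms compute more efficiently. The function names (`znorm`, `euclidean`, `brute_force_motif`) are hypothetical, z-normalization is assumed here as one common way to obtain a scale and mean invariant distance, and skipping overlapping subsequences is likewise an assumption about how trivial matches are excluded.

```python
import math


def znorm(x):
    """Z-normalize a subsequence (assumed form of the normalized distance)."""
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    if std == 0:
        # A constant subsequence has no scale; map it to all zeros.
        return [0.0] * n
    return [(v - mean) / std for v in x]


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def brute_force_motif(series, length, normalize=False):
    """Return (distance, i, j) for the closest pair of non-overlapping
    subsequences of the given length, by checking every pair."""
    n_sub = len(series) - length + 1
    subs = [series[i:i + length] for i in range(n_sub)]
    if normalize:
        subs = [znorm(s) for s in subs]
    best = (math.inf, -1, -1)
    for i in range(n_sub):
        # Start j at i + length so the two subsequences never overlap.
        for j in range(i + length, n_sub):
            d = euclidean(subs[i], subs[j])
            if d < best[0]:
                best = (d, i, j)
    return best


if __name__ == "__main__":
    data = [0, 1, 2, 1, 0, 5, 9, 3, 0, 1, 2, 1, 0, 7]
    print(brute_force_motif(data, 5))
    print(brute_force_motif(data, 5, normalize=True))
```

This baseline compares every pair of subsequences, so its cost grows quadratically with the series length; it is useful mainly as a reference against which a faster exact method can be checked.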
Acknowledgments
This study has been partially supported by JSPS Grant-in-Aid for JSPS Postdoctoral Fellows P12046, JSPS KAKENHI Grant Numbers 24240023 and 15K12098, the Center of Innovation Program from JST, and AFOSR/AOARD Grant No. FA2386-14-1-0005.
About this article
Cite this article
Mohammad, Y., Nishida, T. Exact multi-length scale and mean invariant motif discovery. Appl Intell 44, 322–339 (2016). https://doi.org/10.1007/s10489-015-0684-8
DOI: https://doi.org/10.1007/s10489-015-0684-8