Latent Time-Series Motifs

Published: 20 July 2016

Abstract

Motifs are the most repetitive/frequent patterns of a time-series. The discovery of motifs is crucial for practitioners in order to understand and interpret the phenomena occurring in sequential data. Currently, motifs are searched among series sub-sequences, aiming at selecting the most frequently occurring ones. Search-based methods, which try out series sub-sequences as motif candidates, are currently believed to be the best methods for finding the most frequent patterns.
However, this paper proposes an entirely new perspective on finding motifs. We demonstrate that searching is non-optimal because it restricts the domain of motifs to sub-sequences of the series, and we instead propose a principled optimization approach able to find optimal motifs. We treat the occurrence frequency as a function and the time-series motifs as its parameters; we then learn the optimal motifs that maximize the frequency function. In contrast to searching, our method is able to discover the most repetitive patterns (hence optimal), even in cases where they do not explicitly occur as sub-sequences. Experiments on several real-life time-series datasets show that the motifs found by our method are considerably more frequent than the ones found through searching, for exactly the same distance threshold.
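
The idea of treating frequency as a function of the motif can be illustrated with a minimal Python sketch. This is not the paper's formulation, which the abstract does not spell out. It assumes a mean squared distance between the motif and each sub-sequence, a sigmoid-smoothed count so that the frequency is differentiable, and plain fixed-step gradient ascent. The names `sliding_windows`, `hard_frequency`, `soft_frequency_and_grad`, `learn_motif` and the parameters `sharpness`, `lr`, `steps` are hypothetical.

```python
import numpy as np


def sliding_windows(series, length):
    """Stack every contiguous sub-sequence of the given length as a row."""
    series = np.asarray(series, dtype=float)
    count = len(series) - length + 1
    return np.stack([series[i:i + length] for i in range(count)])


def hard_frequency(motif, windows, threshold):
    """Count the windows whose mean squared distance to the motif is below threshold."""
    dists = np.mean((windows - motif) ** 2, axis=1)
    return int(np.sum(dists < threshold))


def soft_frequency_and_grad(motif, windows, threshold, sharpness):
    """Smoothed frequency (assumed form: a sum of sigmoids) and its gradient.

    Each window contributes sigmoid(sharpness * (threshold - distance)), which
    tends to the hard 0/1 count as sharpness grows.
    """
    diffs = windows - motif
    dists = np.mean(diffs ** 2, axis=1)
    # Clip the sigmoid argument to avoid overflow in exp; the sigmoid is
    # already saturated (near-zero gradient) in the clipped range.
    z = np.clip(sharpness * (threshold - dists), -50.0, 50.0)
    s = 1.0 / (1.0 + np.exp(-z))
    # Chain rule: d(dist)/d(motif) = -2 * diffs / L, so
    # d(sigmoid)/d(motif) = s * (1 - s) * sharpness * 2 * diffs / L.
    weights = s * (1.0 - s) * sharpness
    grad = (weights[:, None] * (2.0 * diffs / motif.size)).sum(axis=0)
    return float(s.sum()), grad


def learn_motif(windows, init, threshold, sharpness=10.0, lr=0.05, steps=200):
    """Fixed-step gradient ascent on the smoothed frequency.

    Returns the best iterate seen and its smoothed frequency, so the result is
    never worse than the starting point under the smoothed objective.
    """
    motif = np.asarray(init, dtype=float).copy()
    best_motif = motif.copy()
    best_value, grad = soft_frequency_and_grad(motif, windows, threshold, sharpness)
    for _ in range(steps):
        motif = motif + lr * grad
        value, grad = soft_frequency_and_grad(motif, windows, threshold, sharpness)
        if value > best_value:
            best_value, best_motif = value, motif.copy()
    return best_motif, best_value
```

Starting `learn_motif` from an existing sub-sequence (for example, a candidate returned by a search-based method) lets the motif move away from the set of observed sub-sequences, which is the freedom the abstract argues search-based methods lack. Comparing `hard_frequency` before and after, at the same `threshold`, mirrors the comparison described in the abstract, although this sketch makes no claim about reproducing the reported results.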


Published In

ACM Transactions on Knowledge Discovery from Data, Volume 11, Issue 1
February 2017
288 pages
ISSN:1556-4681
EISSN:1556-472X
DOI:10.1145/2974720
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 20 July 2016
Accepted: 01 May 2016
Revised: 01 April 2016
Received: 01 June 2015
Published in TKDD Volume 11, Issue 1

Author Tags

  1. Time series
  2. motifs
  3. repeated patterns

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Seventh Framework Programme of the European Commission, through the REDUCTION
  • Deutsche Forschungsgemeinschaft within the project HyLAP
