Matrix Profile XXIV: Scaling Time Series Anomaly Detection to Trillions of Datapoints and Ultra-fast Arriving Data Streams

Research article, open access
DOI: 10.1145/3534678.3539271
Published: 14 August 2022

Abstract

Time series anomaly detection remains one of the most active areas of research in data mining. Despite the dozens of creative solutions proposed for this problem, recent empirical evidence suggests that time series discords, a relatively simple twenty-year-old distance-based technique, remain among the state-of-the-art techniques. While there are many algorithms for computing time series discords, they all have limitations. First, they are limited to the batch case, whereas the online case is more actionable. Second, these algorithms exhibit poor scalability beyond tens of thousands of datapoints. In this work we introduce DAMP, a novel algorithm that addresses both of these issues. DAMP computes exact left-discords on fast-arriving streams, at up to 300,000 Hz on a commodity desktop. This allows us to find time series discords in datasets with trillions of datapoints for the first time. We will demonstrate the utility of our algorithm with the most ambitious set of time series anomaly detection experiments ever conducted.
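
To make the notion of a left-discord concrete, the following is a minimal brute-force sketch, not DAMP itself. It assumes the usual definition: for each subsequence of length m, take the z-normalized Euclidean distance to its nearest neighbor among subsequences that lie entirely to its left, and report the subsequence for which that distance is largest. The names `znorm` and `left_discord_brute_force`, and the rule that candidate neighbors must not overlap the query, are illustrative assumptions. This quadratic-time version only shows what is being computed, not how DAMP achieves its speed.

```python
import numpy as np


def znorm(x):
    """Z-normalize a 1-D array; a constant window maps to all zeros."""
    sd = x.std()
    if sd < 1e-12:
        return np.zeros_like(x, dtype=float)
    return (x - x.mean()) / sd


def left_discord_brute_force(ts, m):
    """Return (start_index, score) of the top left-discord for window length m.

    score is the z-normalized Euclidean distance from the subsequence to its
    nearest neighbor among subsequences that end before it begins
    (assumption: overlapping neighbors are excluded).
    """
    ts = np.asarray(ts, dtype=float)
    n_sub = len(ts) - m + 1
    if n_sub <= m:
        raise ValueError("need at least 2*m datapoints")

    # Pre-normalize every subsequence once.
    windows = np.array([znorm(ts[i:i + m]) for i in range(n_sub)])

    # left_mp[i] = distance from subsequence i to its nearest left neighbor.
    # The first m subsequences have no non-overlapping left neighbor.
    left_mp = np.full(n_sub, -np.inf)
    for i in range(m, n_sub):
        # Candidates start at 0..i-m, so each ends before position i.
        dists = np.linalg.norm(windows[: i - m + 1] - windows[i], axis=1)
        left_mp[i] = dists.min()

    best = int(np.argmax(left_mp))
    return best, float(left_mp[best])


if __name__ == "__main__":
    # Hypothetical demo data: a clean sine wave with a short flat segment.
    t = np.arange(1000)
    series = np.sin(2 * np.pi * t / 50)
    series[600:625] = 0.0
    print(left_discord_brute_force(series, 50))
```

Because only data to the left of each subsequence is consulted, the score for a newly arrived subsequence never depends on future data, which is what makes the left-discord formulation suitable for the online case described above.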

Supplemental Material

MP4 File
There is increasing independent evidence that time series discords (also known as Matrix Profile) are state-of-the-art for anomaly detection. The Matrix Profile is reasonably fast up to, say, 100,000 datapoints, but what if you have a million, a billion, or a trillion datapoints? Good news! You can now handle such massive datasets, or data streams arriving at over 100,000 Hz, using a novel anomaly detection framework called DAMP (Discord Aware Matrix Profile). DAMP computes exact left-discords on fast-arriving streams, at up to 300,000 Hz on a commodity desktop. We will demonstrate the utility of our algorithm with the most ambitious set of time series anomaly detection experiments ever conducted.
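
As a rough sense of scale for the figures quoted above, the short calculation below works out the per-datapoint time budget at 300,000 Hz and how long a trillion datapoints would take if that peak rate were sustained throughout. The sustained-rate assumption and the helper names are illustrative only; the source does not state an end-to-end running time, so this is back-of-the-envelope arithmetic rather than a reported result.

```python
SECONDS_PER_DAY = 86_400


def time_budget_us(rate_hz):
    """Microseconds available per datapoint at a given arrival rate."""
    return 1e6 / rate_hz


def days_to_process(n_points, rate_hz):
    """Days needed for n_points, assuming the rate is sustained throughout."""
    return n_points / rate_hz / SECONDS_PER_DAY


if __name__ == "__main__":
    rate = 300_000  # Hz, the peak rate quoted in the abstract
    print(f"budget per datapoint: {time_budget_us(rate):.2f} microseconds")
    print(f"one trillion datapoints: {days_to_process(1e12, rate):.1f} days")
```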

    Published In

    KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
    August 2022
    5033 pages
    ISBN: 9781450393850
    DOI: 10.1145/3534678
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. anomaly detection
    2. streaming data
    3. time series

    Qualifiers

    • Research-article

    Conference

    KDD '22
    Acceptance Rates

    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

    Article Metrics

    • Downloads (last 12 months): 1,020
    • Downloads (last 6 weeks): 157
    Reflects downloads up to 05 Mar 2025

    Cited By

    • (2025) A severe local flood and social events show a similar impact on human mobility. npj Complexity 2:1. DOI: 10.1038/s44260-025-00030-6. Online publication date: 18-Feb-2025
    • (2025) Enhancement of the Local Outlier Factor Algorithm for Anomaly Detection in Time Series. Dynamics of Information Systems (171-188). DOI: 10.1007/978-3-031-81010-7_12. Online publication date: 26-Feb-2025
    • (2024) A Survey of Advanced Border Gateway Protocol Attack Detection Techniques. Sensors 24:19 (6414). DOI: 10.3390/s24196414. Online publication date: 3-Oct-2024
    • (2024) Anomaly Detection in Gas Turbines Using Outlet Energy Analysis with Cluster-Based Matrix Profile. Energies 17:3 (653). DOI: 10.3390/en17030653. Online publication date: 30-Jan-2024
    • (2024) METER: A Dynamic Concept Adaptation Framework for Online Anomaly Detection. Proceedings of the VLDB Endowment 17:4 (794-807). DOI: 10.14778/3636218.3636233. Online publication date: 5-Mar-2024
    • (2024) CutAddPaste: Time Series Anomaly Detection by Exploiting Abnormal Knowledge. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (3176-3187). DOI: 10.1145/3637528.3671739. Online publication date: 25-Aug-2024
    • (2024) Lithium-Ion Battery State of Health Estimation by Matrix Profile Empowered Online Knee Onset Identification. IEEE Transactions on Transportation Electrification 10:1 (1935-1946). DOI: 10.1109/TTE.2023.3265981. Online publication date: Mar-2024
    • (2024) Diner: Interpretable Anomaly Detection for Seasonal Time Series in Web Services. IEEE Transactions on Services Computing 17:5 (2248-2260). DOI: 10.1109/TSC.2024.3422894. Online publication date: Sep-2024
    • (2024) Adversarial Graph Neural Network for Multivariate Time Series Anomaly Detection. IEEE Transactions on Knowledge and Data Engineering 36:12 (7612-7626). DOI: 10.1109/TKDE.2024.3419891. Online publication date: Dec-2024
    • (2024) Matrix Profile for Anomaly Detection on Multidimensional Time Series. 2024 IEEE International Conference on Data Mining (ICDM) (911-916). DOI: 10.1109/ICDM59182.2024.00114. Online publication date: 9-Dec-2024
