research-article

LRZ Convolution: An Algorithm for Automatic Anomaly Detection in Time-series Data

Author:
Arunprasad P. Marathe

Huawei Technologies Canada, Canada


SSDBM '20: Proceedings of the 32nd International Conference on Scientific and Statistical Database Management, July 2020, Article No.: 1, Pages 1–12. https://doi.org/10.1145/3400903.3400904

Published: 30 July 2020


ABSTRACT

Automatic anomaly detection is a hard but practically useful problem. With telemetry data sizes growing constantly, experts will rely increasingly on automation to bring anomalies to their attention. In this paper, anomaly transition points (called change points elsewhere) are determined using a novel application of a somewhat obscure statistical score called the “z-score of mean difference”. Use of this score yields a practical linear-time algorithm, called LRZ Convolution, that has sound statistical underpinnings and does not require data normality. Each anomaly transition point is accompanied by a set of explanatory predicates that can form a good starting point for determining an anomaly’s root causes. Careful experimental evaluation and performance in two independent domains show promising results. A preliminary comparison with a well-known machine learning algorithm called Support Vector Machines (SVM) yields a highly favorable outcome.
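
The abstract names the score but does not spell it out. As a rough illustration only, the sketch below computes the standard two-sample z statistic for the difference of means, z = (mean_right - mean_left) / sqrt(var_left/n + var_right/n), between the points immediately to the left and right of each index, using prefix sums so that the whole pass stays linear in the series length. The function names (`lr_z_scores`, `transition_points`), the fixed window size, and the threshold are assumptions made for this example; this is not the paper's LRZ Convolution algorithm, and it does not produce explanatory predicates.

```python
import math
from itertools import accumulate


def lr_z_scores(series, window):
    """Two-sample z statistic between the `window` points left and right of each index.

    Hypothetical helper for illustration; not the paper's implementation.
    Indices without a full window on both sides are left as None.
    """
    if window < 2:
        raise ValueError("window must be at least 2 to estimate a variance")
    n = len(series)
    # Prefix sums of values and squared values give O(1) statistics per window.
    s1 = [0.0] + list(accumulate(series))
    s2 = [0.0] + list(accumulate(x * x for x in series))

    def mean_and_var(lo, hi):
        k = hi - lo
        mean = (s1[hi] - s1[lo]) / k
        # Sample variance; clamped at 0 to absorb rounding error.
        var = max((s2[hi] - s2[lo] - k * mean * mean) / (k - 1), 0.0)
        return mean, var

    scores = [None] * n
    for i in range(window, n - window + 1):
        left_mean, left_var = mean_and_var(i - window, i)
        right_mean, right_var = mean_and_var(i, i + window)
        std_err = math.sqrt(left_var / window + right_var / window)
        diff = right_mean - left_mean
        if std_err > 0:
            scores[i] = diff / std_err
        else:
            # Both windows are constant: the score is 0 or unbounded.
            scores[i] = 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return scores


def transition_points(scores, threshold):
    """Indices whose absolute score reaches `threshold` (an assumed cut-off)."""
    return [i for i, z in enumerate(scores) if z is not None and abs(z) >= threshold]
```

A candidate transition point is then any index whose absolute score is large; in practice the window size and threshold would need tuning to the data, and nearby indices above the threshold would usually be merged into a single point.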



Published in

SSDBM '20: Proceedings of the 32nd International Conference on Scientific and Statistical Database Management
July 2020
241 pages
ISBN:9781450388146
DOI:10.1145/3400903

Copyright © 2020 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
anomaly detection
convolution
data science
experimentation
performance measurement
statistical significance
time-series data
z-score
z-score of mean difference
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 56 of 146 submissions, 38%