Statistical technique for online anomaly detection using Spark over heterogeneous data from multi-source VMware performance data | IEEE Conference Publication | IEEE Xplore

Statistical technique for online anomaly detection using Spark over heterogeneous data from multi-source VMware performance data


Abstract:

Anomaly detection refers to the identification of patterns in a dataset that do not conform to expected patterns. Depending on the domain, the non-conformant patterns are...Show More

Abstract:

Anomaly detection refers to the identification of patterns in a dataset that do not conform to expected patterns. Depending on the domain, the non-conformant patterns are assigned various tags, e.g. anomalies, outliers, exceptions, malwares and so forth. Online anomaly detection aims to detect anomalies in data flowing in a streaming fashion. Such stream data is commonplace in today's cloud data centers that house a large array of virtual machines(VM) producing vast amounts of performance data in real-time. Sophisticated detection mechanism will likely entail collation of data from heterogeneous sources with diversified data format and semantics. Therefore, detection of performance anomaly in this context requires a distributed framework with high throughput and low latency. Apache Spark is one such framework that represents the bleeding-edge amongst its contemporaries. In this paper, we have taken up the challenge of anomaly detection in VMware based cloud data centers. We have employed a Chi-square based statistical anomaly detection technique in Spark. We have demonstrated how to take advantage of the high processing power of Spark to perform anomaly detection on heterogeneous data using statistical techniques. Our approach is optimally designed to cope with the heterogeneity of input data streams and the experiments we conducted testify to its efficacy in online anomaly detection.
Date of Conference: 27-30 October 2014
Date Added to IEEE Xplore: 08 January 2015
Electronic ISBN:978-1-4799-5666-1
Conference Location: Washington, DC, USA

Contact IEEE to Subscribe

References

References is not available for this document.