ABSTRACT
This paper presents a novel methodology, based on first principles of statistics and statistical learning, for anomaly detection in industrial processes and IoT environments. We present a five-level analytical pipeline that cleans and smooths the data, eliminates redundancies, and identifies outliers together with the features that contribute most to them. We show how smoothing can make the methodology less sensitive to short-lived anomalies that may be due to, e.g., sensor noise. We validate the methodology on a dataset freely available in the literature. Our results show that we can identify all anomalies in the considered dataset while controlling the number of false positives. This work is the result of a research project co-funded by the Tuscany Region and a leading company in the paper and nonwovens sector. Although the methodology was developed for this domain, we consider here a dataset from a different industrial sector, which suggests that the methodology can be generalized to other contexts with similar constraints on resources, interpretability, time, and budget.
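The effect of smoothing on short-lived anomalies can be illustrated with a minimal univariate sketch. It is not the paper's pipeline: the trailing moving average, the window length, the synthetic signal, and the Chebyshev-based threshold (k = 1/sqrt(alpha), which bounds the false-positive rate by alpha for any distribution with known mean and variance) are all assumptions chosen for illustration.

```python
import numpy as np

def moving_average(x, window):
    """Trailing moving average; the first window-1 samples are averaged
    against implicit zeros (hypothetical smoothing choice)."""
    kernel = np.ones(window) / window
    return np.convolve(np.asarray(x, dtype=float), kernel, mode="full")[: len(x)]

def chebyshev_k(alpha):
    """Chebyshev: P(|X - mu| >= k*sigma) <= 1/k**2, so k = 1/sqrt(alpha)
    bounds the false-positive rate by alpha."""
    return 1.0 / np.sqrt(alpha)

# Synthetic, deterministic signal (assumed for illustration only).
t = np.arange(200)
signal = np.sin(t).astype(float)
signal[50] += 8.0          # short-lived spike, e.g. a sensor glitch
signal[120:150] += 6.0     # sustained shift, i.e. a persistent anomaly

# Reference statistics estimated on an anomaly-free stretch.
mu, sigma = signal[:40].mean(), signal[:40].std()
threshold = chebyshev_k(alpha=0.05) * sigma

# Flag samples on the raw and on the smoothed signal. Reusing the raw sigma
# for the smoothed signal is a conservative simplification.
raw_flags = np.abs(signal - mu) > threshold
smooth_flags = np.abs(moving_average(signal, window=5) - mu) > threshold

print("spike flagged (raw):     ", bool(raw_flags[50]))
print("spike flagged (smoothed):", bool(smooth_flags[50:55].any()))
print("shift flagged (smoothed):", bool(smooth_flags[124:150].all()))
```

In this sketch the one-sample spike exceeds the threshold on the raw signal but is diluted by the window once smoothed, while the sustained shift remains above the threshold. A longer window suppresses longer transients at the cost of delayed detection.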
Index Terms
- Towards Novel Statistical Methods for Anomaly Detection in Industrial Processes