Multivariate Time Series Anomaly Detection: Fancy Algorithms and Flawed Evaluation Methodology

El Amine Sehili, Mohamed; Zhang, Zonghua

doi:10.1007/978-3-031-68031-1_1

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14247)

Included in the following conference series:

Technology Conference on Performance Evaluation and Benchmarking

Abstract

Multivariate Time Series (MVTS) anomaly detection is a long-standing and challenging research topic that has attracted tremendous research effort from both industry and academia in recent years. However, a careful study of the literature reveals that 1) the community is active but not as organized as sibling machine learning communities such as Computer Vision (CV) and Natural Language Processing (NLP), and 2) most proposed solutions are evaluated using inappropriate or highly flawed protocols that lack a sound scientific foundation. One very popular protocol, the so-called point-adjust protocol, is so flawed that a random guess can be shown to systematically outperform all algorithms developed so far. In this paper, we review and evaluate a number of recent algorithms using more robust protocols, and we discuss how a normally good protocol may have weaknesses in the context of MVTS anomaly detection and how to mitigate them. We also share our concerns about the benchmark datasets, experiment design and evaluation methodology we observe in many works. Furthermore, we propose a simple yet challenging baseline algorithm based on Principal Component Analysis (PCA) that surprisingly outperforms many recent deep-learning-based approaches on popular benchmark datasets. The main objective of this work is to stimulate more effort towards important aspects of the research such as data, experiment design, evaluation methodology and result interpretability, rather than placing the highest weight on the design of increasingly complex and “fancier” algorithms. The code repository associated with this paper can be found at https://github.com/amsehili/MVTSEvalPaper.
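The weakness of the point-adjust protocol mentioned in the abstract can be illustrated with a small sketch. It follows the commonly described form of the protocol (if any point inside a ground-truth anomalous segment is flagged, the whole segment is counted as detected) and is not taken from the paper's repository. The series length, segment positions, segment length of 500 points and the 1% random flagging rate are all hypothetical values chosen for illustration, as are the function names.

```python
import random


def f1_score(y_true, y_pred):
    """Point-wise F1 score for binary labels (1 = anomalous)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def point_adjust(y_true, y_pred):
    """Mark a whole true anomalous segment as detected if any of its points is flagged."""
    adjusted = list(y_pred)
    n = len(y_true)
    i = 0
    while i < n:
        if y_true[i]:
            j = i
            while j < n and y_true[j]:
                j += 1
            # y_true[i:j] is one contiguous anomalous segment
            if any(y_pred[i:j]):
                adjusted[i:j] = [1] * (j - i)
            i = j
        else:
            i += 1
    return adjusted


# Hypothetical test series: 10,000 points with three anomalous segments of 500 points.
n = 10_000
y_true = [0] * n
for start in (1000, 4000, 8000):
    y_true[start:start + 500] = [1] * 500

# A "detector" that flags each point at random with probability 0.01.
rng = random.Random(0)
y_rand = [1 if rng.random() < 0.01 else 0 for _ in range(n)]

raw_f1 = f1_score(y_true, y_rand)
pa_f1 = f1_score(y_true, point_adjust(y_true, y_rand))
print(f"random guess, raw F1:            {raw_f1:.3f}")
print(f"random guess, point-adjusted F1: {pa_f1:.3f}")
```

Under these assumptions, a long segment is very likely to contain at least one random hit, so the adjustment converts a few lucky points into hundreds of true positives and the adjusted F1 ends up far above the raw one.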

Notes

1. Assumption based on the description shared by the datasets’ publishers. In any case, no anomaly labels are provided for the training fold of these datasets, and almost all approaches using them are unsupervised.
2. These values are not arbitrary but are close to what we observe in popular benchmark datasets, especially for $A=500$.
3. Machine learning students are usually advised to use F1 as a better alternative to the accuracy score for unbalanced datasets because using the latter would yield a high score for a trivial algorithm that predicts the dominant class for every input.
4. This situation has subtle links to that of imbalanced datasets, for which the F1 score is usually recommended in the first place.
5. By interpretation we mean understanding how many anomalous events the algorithm detects and how many false alarms it raises. This has nothing to do with model interpretability, whose goal is to answer why/how an algorithm made a given decision.
6. See also this issue and related ones on AnomalyTransformer’s official repository: https://github.com/thuml/Anomaly-Transformer/issues/34.
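
The point made in note 3 can be checked with a few lines of arithmetic. The sketch below uses a hypothetical dataset of 1,000 samples with 10 anomalies (these counts are an assumption for illustration, not values from the paper) and a trivial classifier that always predicts the dominant class.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def f1_binary(y_true, y_pred):
    """F1 score for the positive (anomalous) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical imbalanced dataset: 10 anomalies among 1,000 samples.
labels = [1] * 10 + [0] * 990
# Trivial classifier: always predict the dominant (normal) class.
trivial = [0] * 1000

acc = accuracy(labels, trivial)
f1 = f1_binary(labels, trivial)
print(f"accuracy = {acc:.2f}, F1 = {f1:.2f}")  # accuracy = 0.99, F1 = 0.00
```

The trivial classifier is right on 990 of 1,000 samples yet detects no anomaly at all, which is why accuracy is a poor summary here and F1 is usually preferred.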

Author information

Authors and Affiliations

Huawei Technologies France, 8 Quai du Point du Jour, 92100, Boulogne-Billancourt, France
Mohamed El Amine Sehili & Zonghua Zhang

Authors

Mohamed El Amine Sehili
Zonghua Zhang

Corresponding author

Correspondence to Zonghua Zhang.

Editor information

Editors and Affiliations

Advanced Micro Devices Inc., Santa Clara, CA, USA
Raghunath Nambiar
Oracle Corporation, Redwood Shores, CA, USA
Meikel Poess

About this paper

Cite this paper

El Amine Sehili, M., Zhang, Z. (2024). Multivariate Time Series Anomaly Detection: Fancy Algorithms and Flawed Evaluation Methodology. In: Nambiar, R., Poess, M. (eds) Performance Evaluation and Benchmarking. TPCTC 2023. Lecture Notes in Computer Science, vol 14247. Springer, Cham. https://doi.org/10.1007/978-3-031-68031-1_1

DOI: https://doi.org/10.1007/978-3-031-68031-1_1
Published: 22 September 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-68030-4
Online ISBN: 978-3-031-68031-1
eBook Packages: Computer Science, Computer Science (R0)
