Semisupervised anomaly detection of multivariate time series based on a variational autoencoder

Applied Intelligence

Abstract

In a large-scale cloud environment, many key performance indicators (KPIs) of entities are monitored in real time. These multivariate time series consist of high-dimensional, high-noise, random and time-dependent data. As a common method in artificial intelligence for IT operations (AIOps), time series anomaly detection has been widely studied and applied. However, existing detection methods do not fully consider the influence of multiple factors and cannot quickly and accurately detect anomalies in the multivariate KPIs of entities. They also cannot determine fine-grained root cause locations for detected anomalies, and they often require abundant normal data, which are difficult to obtain, for model training. To solve these problems, we propose a long short-term memory (LSTM)-based semisupervised variational autoencoder (VAE) anomaly detection strategy called LR-SemiVAE. First, LR-SemiVAE uses a VAE to perform feature dimension reduction and reconstruction of multivariate time series data and judges whether an entity is abnormal by calculating the reconstruction probability score. Second, by introducing an LSTM network into the VAE encoder and decoder, the model can fully learn the time dependence of multivariate time series. Third, LR-SemiVAE introduces a classifier that predicts data labels, which reduces the dependence on originally labeled data during model training. Finally, through a new method of calculating the evidence lower bound (ELBO) loss function, LR-SemiVAE attends to normal patterns and ignores abnormal patterns during training, which reduces the time cost of removing random anomalies and noisy data. However, because LSTM is limited in learning the long-term dependence of time series data, we build on LR-SemiVAE and propose a transformer-based semisupervised VAE anomaly detection and localization strategy called RT-SemiVAE for cluster systems with complex service dependencies. This method learns the long-term dependence of multivariate time series by introducing a transformer with a parallel multihead attention mechanism, while LSTM is used to capture short-term dependence; the parallel computation also markedly reduces model training time. After RT-SemiVAE detects entity anomalies, it traces the root entities according to the obtained service dependence graph and locates the root causes at the indicator level. We verify the strategies by using public data sets and constructing a system prototype. Experimental results show that, compared with existing baseline methods, the LR-SemiVAE and RT-SemiVAE strategies detect anomalies more quickly and accurately and localize the root causes of anomalies accurately and at a fine granularity.
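The abstract does not give the form of the new ELBO calculation, so the following is only an illustrative sketch of one common way to make a VAE loss "ignore the abnormal pattern": points flagged abnormal (by labels or by a classifier's predictions) are masked out of the reconstruction term, and the KL term is scaled by the fraction of normal points. The function names, the diagonal Gaussian decoder, the scaling factor `beta` and the toy data are all assumptions made for illustration, not the authors' implementation. The same per-point Gaussian log-likelihood is reused as a reconstruction probability score at detection time.

```python
import math

def gaussian_log_prob(x, mean, log_var):
    """Log density of scalar x under N(mean, exp(log_var))."""
    return -0.5 * (math.log(2.0 * math.pi) + log_var + (x - mean) ** 2 / math.exp(log_var))

def kl_to_standard_normal(z_mean, z_log_var):
    """KL(N(z_mean, diag(exp(z_log_var))) || N(0, I)) for one latent vector."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(z_mean, z_log_var))

def masked_elbo(window, normal_mask, x_mean, x_log_var, z_mean, z_log_var):
    """ELBO of one window in which points flagged abnormal add no reconstruction term.

    window, normal_mask, x_mean, x_log_var are lists shaped [time][kpi];
    a mask value of 1 means normal and 0 means abnormal (assumed convention).
    """
    recon, n_normal, n_total = 0.0, 0, 0
    for t, row in enumerate(window):
        for k, x in enumerate(row):
            n_total += 1
            if normal_mask[t][k]:
                n_normal += 1
                recon += gaussian_log_prob(x, x_mean[t][k], x_log_var[t][k])
    beta = n_normal / n_total  # assumed: shrink the KL term with the share of usable points
    return recon - beta * kl_to_standard_normal(z_mean, z_log_var)

def anomaly_scores(window, x_mean, x_log_var):
    """Negative reconstruction log-probability per time step, summed over KPIs."""
    return [
        -sum(gaussian_log_prob(x, x_mean[t][k], x_log_var[t][k]) for k, x in enumerate(row))
        for t, row in enumerate(window)
    ]

# Toy window: 4 time steps, 3 KPIs, with a spike injected at step 1.
window = [[0.1, -0.2, 0.0], [50.0, 50.0, 50.0], [0.3, 0.1, -0.1], [-0.2, 0.0, 0.2]]
zeros = [[0.0] * 3 for _ in window]        # stand-in decoder mean and log-variance
all_normal = [[1] * 3 for _ in window]
spike_masked = [[1] * 3, [0] * 3, [1] * 3, [1] * 3]
z_mean, z_log_var = [0.0, 0.0], [0.0, 0.0]  # posterior equal to the prior, so KL is 0

elbo_unmasked = masked_elbo(window, all_normal, zeros, zeros, z_mean, z_log_var)
elbo_masked = masked_elbo(window, spike_masked, zeros, zeros, z_mean, z_log_var)
scores = anomaly_scores(window, zeros, zeros)
```

In this toy setting the masked ELBO is much higher than the unmasked one, because the spike no longer pulls the training objective toward reproducing it, while the detection-time score still peaks at the spiked time step. A real model would obtain `x_mean`, `x_log_var`, `z_mean` and `z_log_var` from trained encoder and decoder networks rather than from fixed zeros.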

Notes

  1. https://webscope.sandbox.yahoo.com/catalog.php?datatype=s&did=70

  2. https://github.com/microservices-demo/microservices-demo

  3. https://github.com/chaosblade-io/chaosblade

Acknowledgements

This work is supported in part by the National Key Research and Development Project of China under Grants 2017YFC1602005 and 2018YFB1404404, the Natural Science Foundation of China under Grants 62162003 and 61762008, and the Innovation Project of Guangxi Graduate Education under Grant YCSW2022075.

Author information

Contributions

Ningjiang Chen: Conceptualization, Methodology, Resources, Supervision, Funding acquisition. Huan Tu: Software, Investigation, Resources, Writing - review & editing, Visualization. Xiaoyan Duan: Validation, Formal analysis, Writing - original draft. Liangqing Hu: Data collection and curation. Chengxiang Guo: Data processing and analysis.

Corresponding author

Correspondence to Ningjiang Chen.

Ethics declarations

Competing interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Chen, N., Tu, H., Duan, X. et al. Semisupervised anomaly detection of multivariate time series based on a variational autoencoder. Appl Intell 53, 6074–6098 (2023). https://doi.org/10.1007/s10489-022-03829-1

  • DOI: https://doi.org/10.1007/s10489-022-03829-1
