ABSTRACT
Anomaly detection is increasingly applied across security domains, making the effectiveness and efficiency of detection models vitally important. Deep learning models are widely used for this task because of their flexibility and learning ability. Their performance, however, depends heavily on the information available for training and detection. Previous methods use system logs and traces, but most focus on a single type of data source, and combining logs and traces appropriately to obtain comprehensive information for anomaly detection remains challenging. We propose LogTraceAD, a novel anomaly detection method that uses logs and traces together to generate a graph and applies a variational autoencoder-based graph representation learning model to learn features. The resulting features, which carry information from both data types, are then used for anomaly detection. We evaluate the method on a publicly available dataset containing 23,334 anomalies in 7,705,050 logs and 132,485 traces, and compare it with several previous approaches. The results show that our method achieves improvements of 24% and 27% over methods that use only logs or only traces, respectively, without incurring high overhead.
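To make the idea of generating one graph from both logs and traces concrete, the following is a minimal sketch and not the authors' construction, which the abstract does not specify. It assumes each span records its parent span and each log event records the span that emitted it. The function name `build_trace_log_graph`, the field names (`span_id`, `parent_id`, `service`, `timestamp`, `template`), and the edge types are all hypothetical.

```python
from collections import defaultdict


def build_trace_log_graph(spans, logs):
    """Merge one trace's spans and log events into a single directed graph.

    spans: list of dicts with hypothetical keys 'span_id', 'parent_id', 'service'
    logs:  list of dicts with hypothetical keys 'span_id', 'timestamp', 'template'
    Returns (nodes, edges): nodes maps node id -> label,
    edges is a list of (source id, target id, edge type).
    """
    nodes = {}
    edges = []

    # Group log events by the span that emitted them, in time order.
    logs_by_span = defaultdict(list)
    for log in sorted(logs, key=lambda entry: entry["timestamp"]):
        logs_by_span[log["span_id"]].append(log)

    for span in spans:
        sid = span["span_id"]
        nodes[sid] = "span:" + span["service"]

        # Trace structure: the parent span calls the child span.
        if span["parent_id"] is not None:
            edges.append((span["parent_id"], sid, "calls"))

        # Log structure: span -> first log event -> second log event -> ...
        prev = sid
        for i, log in enumerate(logs_by_span[sid]):
            lid = f"{sid}/log{i}"
            nodes[lid] = "log:" + log["template"]
            edges.append((prev, lid, "emits" if prev == sid else "next"))
            prev = lid

    return nodes, edges


# Toy input (invented for illustration): two spans and three log events.
spans = [
    {"span_id": "a", "parent_id": None, "service": "gateway"},
    {"span_id": "b", "parent_id": "a", "service": "orders"},
]
logs = [
    {"span_id": "b", "timestamp": 2, "template": "order saved"},
    {"span_id": "b", "timestamp": 1, "template": "order received"},
    {"span_id": "a", "timestamp": 0, "template": "request accepted"},
]
nodes, edges = build_trace_log_graph(spans, logs)
```

In a pipeline of the kind the abstract describes, a graph like this would be passed to a graph representation learning model, and the learned features would feed the anomaly detector. Those stages are omitted here because the abstract gives no details about them.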
Index Terms
- LogTraceAD: Anomaly Detection from Both Logs and Traces with Graph Representation Learning