DOI: 10.1145/3605801.3605824
research-article

LogTraceAD: Anomaly Detection from Both Logs and Traces with Graph Representation Learning

Published: 09 August 2023

ABSTRACT

Anomaly detection is increasingly applied across security fields, which makes the effectiveness and efficiency of anomaly detection models vitally important. Deep learning models are widely used for this task because of their flexibility and learning ability. How well such a model performs, however, depends above all on the information used for training and detection. Previous methods use system logs and traces, but most focus on a single type of data source, and combining logs and traces appropriately to obtain comprehensive information for anomaly detection remains challenging. We propose LogTraceAD, a novel anomaly detection method that builds a graph from logs and traces and uses a variational autoencoder-based graph representation learning model for feature learning. The learned features, which carry information from both types of data, are then used for anomaly detection. We evaluate the method on a publicly available dataset containing 23,334 anomalies in 7,705,050 logs and 132,485 traces, and compare it with several previous approaches. The results show that our method achieves improvements of 24% and 27% over methods that use only logs or only traces, respectively, without incurring high overhead.
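The abstract does not say how the graph is built from logs and traces. As an illustration only, the sketch below shows one plausible way to merge the spans of a single trace with the log events emitted inside them into one directed graph. The function name `build_trace_log_graph`, the record keys (`span_id`, `parent_id`, `service`, `timestamp`, `template`), and the edge rules are all assumptions made for this example, not the authors' construction.

```python
from collections import defaultdict


def build_trace_log_graph(spans, logs):
    """Merge one trace's spans and log events into a directed graph.

    Hypothetical input schema (not taken from the paper):
      spans: dicts with keys "span_id", "parent_id", "service"
      logs:  dicts with keys "span_id", "timestamp", "template"
    Returns (nodes, edges): nodes maps node id -> label,
    edges is a set of (source id, target id) pairs.
    """
    nodes = {}
    edges = set()

    # One node per span, labelled with the service that produced it.
    for span in spans:
        nodes[("span", span["span_id"])] = span["service"]

    # Parent -> child edges capture the call structure of the trace.
    for span in spans:
        parent = ("span", span["parent_id"])
        if span["parent_id"] is not None and parent in nodes:
            edges.add((parent, ("span", span["span_id"])))

    # Group log events by the span that emitted them.
    by_span = defaultdict(list)
    for log in logs:
        by_span[log["span_id"]].append(log)

    # Chain each span's log events in timestamp order: span -> log0 -> log1 ...
    for span_id, events in by_span.items():
        prev = ("span", span_id)
        if prev not in nodes:
            continue  # skip logs that cannot be attached to a known span
        for i, event in enumerate(sorted(events, key=lambda e: e["timestamp"])):
            node = ("log", span_id, i)
            nodes[node] = event["template"]
            edges.add((prev, node))
            prev = node

    return nodes, edges


if __name__ == "__main__":
    spans = [
        {"span_id": "a", "parent_id": None, "service": "gateway"},
        {"span_id": "b", "parent_id": "a", "service": "orders"},
    ]
    logs = [
        {"span_id": "b", "timestamp": 2, "template": "order <*> failed"},
        {"span_id": "b", "timestamp": 1, "template": "received order <*>"},
    ]
    nodes, edges = build_trace_log_graph(spans, logs)
    print(len(nodes), "nodes,", len(edges), "edges")
```

In a pipeline like the one the abstract outlines, a graph of this kind would be the input to the representation learning model, and the learned features would then be passed to the anomaly detector. Those later stages are not sketched here because the abstract gives no detail about them.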

      Published in

      CNCIT '23: Proceedings of the 2023 2nd International Conference on Networks, Communications and Information Technology
      June 2023
      253 pages
      ISBN: 9798400700620
      DOI: 10.1145/3605801

      Copyright © 2023 ACM


      Publisher

      Association for Computing Machinery

      New York, NY, United States
