A Survey On Log Research Of AIOps: Methods and Trends

Mobile Networks and Applications

Abstract

With the development of Artificial Intelligence (AI), the Internet of Things (IoT), cloud computing, and new-generation mobile communication, digital transformation is changing the technical architecture of IT systems and raising the requirements for performance and reliability. Traditional human-dependent development and maintenance methods are overwhelmed and need to give way to Artificial Intelligence for IT Operations (AIOps). As one of the most useful data resources in IT systems, logs play an important role in AIOps. Much research has addressed enhancing log quality, analyzing log structure, understanding system behavior, and helping users mine the useful information in logs. Based on the characteristics of logs and the different strategies adopted, this paper reviews and categorizes existing academic work around the three key processes of the log processing framework: log enhancement, log parsing, and log analysis. It also establishes evaluation indicators for comparison and summary. Finally, we discuss potential directions and future development trends.

References

  1. Gartner https://www.gartner.com

  2. ELK https://www.elastic.co/elk-stack

  3. GrayLog https://www.graylog.org

  4. Borthakur D (2007) The hadoop distributed file system: Architecture and design. Hadoop Project Website 11:21

  5. Fu Q, Zhu JM, Hu WL, Lou JG, Ding R, Lin QW, Zhang DM, Xie T (2014) Where do developers log? An empirical study on logging practices in industry. Int Conf Softw Eng 24–33

  6. Yuan D, Park S, Zhou Y (2012) Characterizing logging practices in open-source software. Int Conf Softw Eng 102–112

  7. Astekin M, Zengin H, Sozer H (2018) Evaluation of distributed machine learning algorithms for anomaly detection from Large-Scale system logs: a case study. IEEE Int Conf Big Data 2071–2077

  8. Hadoop http://hadoop.apache.org

  9. Splunk http://www.splunk.com

  10. Yuan D, Park S, Huang P, Liu Y, Lee MM, Tang XM, Zhou YY, Savage S (2012) Be conservative: enhancing failure diagnosis with proactive logging. USENIX Conf Oper Sys Des Implement 293–306

  11. Barik T, Deline R, Drucker S, Fisher D (2016) The bones of the system: a case study of logging and telemetry at Microsoft. Int Conf Softw Eng 92–101

  12. Zhang X, Xu Y, Lin QW, Qiao B, Zhang HY, Dang YN, Xie CY, Yang XS, Cheng Q, Li Z, Chen JJ, He XT, Yao R, Lou JG, Chintalapati M, Shen F, Zhang DM (2019) Robust Log-Based anomaly detection on unstable log data. ACM Joint European Softw Eng Conf Symp Found Softw Eng 807–817

  13. RFC5424 http://tools.ietf.org/html/rfc5424

  14. Log4j http://logging.apache.org/log4j

  15. Li H, Shang W, Hassan AE (2017) Which log level should developers choose for a new logging statement? Empir Softw Eng 22(4):1684–1716

  16. Chen B, Jiang ZMJ (2017) Characterizing logging practices in Java-based open source software projects - a replication study in Apache Software Foundation. Empir Softw Eng 22(1):330–374

  17. Yuan D, Zheng J, Park S, Zhou Y, Savage S (2012) Improving software diagnosability via log enhancement. ACM Trans Comput Sys 30(1):1–28

  18. Rahman F, Bird C, Devanbu P (2012) Clones: What is that smell? Empir Softw Eng 17(4–5):503–530

  19. Li Z, Chen TH, Yang J, Shang W (2019) DLFinder: Characterizing and detecting duplicate logging code smells. Int Conf Softw Eng 152–163

  20. Liu Z, Xia X, Lo D, Xing Z, Li S (2019) Which variables should I log? IEEE Trans Softw Eng PP(99):1–1

  21. He P, Chen Z, He S, Lyu M (2018) Characterizing the natural language descriptions in software logging statements. ACM/IEEE Int Conf Autom Softw Eng 178–189

  22. Zhu J, He P, Fu Q, Zhang H, Lyu MR, Zhang D (2015) Learning to log: helping developers make informed logging decisions. Int Conf Softw Eng 1:415–425

  23. Li H, Shang W, Zou Y, Hassan AE (2017) Towards just-in-time suggestions for log changes. Empir Softw Eng 22(4):1831–1865

  24. Cinque M, Cotroneo D, Pecchia A (2012) Event logs for the analysis of software failures: a Rule-Based approach. IEEE Trans Softw Eng 39(6):806–821

  25. Zhao X, Rodrigues K, Luo Y, Stumm M, Yuan D, Zhou Y (2017) Log20: Fully automated optimal placement of log printing statements under specified overhead threshold. Symp Oper Sys Princip 565–581

  26. A beginners’ guide to logstash grok, https://logz.io/blog/logstash-grok

  27. Xu W, Huang L, Fox A, Patterson D, Jordan M (2010) Detecting Large-Scale system problems by mining console logs. SOSP’09. 2009

  28. Nagappan M, Wu K, Vouk MA Efficiently extracting operational profiles from execution logs using suffix arrays. Int Conf Mach Learn 37–46

  29. Vaarandi R (2003) A data clustering algorithm for mining patterns from event logs. IEEE Workshop IP Oper Manag 119–126

  30. Fu Q, Lou JG, Wang Y, Li J (2009) Execution anomaly detection in distributed systems through unstructured log analysis. IEEE Int Conf Data Min 149–158

  31. Nagappan M, Vouk MA (2010) Abstracting log lines to log event types for mining software system logs. IEEE Work Conf Mining Softw Reposit 114–117

  32. Makanju AAO, Zincir-Heywood AN, Milios EE (2009) Clustering event logs using iterative partitioning. ACM SIGKDD Int Conf Knowl Discov Data Min 1255–1264

  33. Mizutani M (2013) Incremental mining of system log format. IEEE Int Conf Serv Comput 595–602

  34. Vaarandi R, Pihelgas M (2015) Logcluster - a data clustering and pattern mining algorithm for event logs. Int Conf Netw Serv Manag 1–7

  35. Shima K (2016) Length matters: Clustering system log messages using length of words. arXiv:1611.03213

  36. Hamooni H, Debnath B, Xu J, Zhang H, Jiang G, Mueen A (2016) Logmine: fast pattern recognition for log analytics. ACM Int Conf Inf Knowl Manag 1573–1582

  37. Du M, Li F (2016) Spell: Streaming parsing of system event logs. Int Conf Data Min 859–864

  38. He P, Zhu J, Zheng Z, Lyu MR (2017) Drain: an online log parsing approach with fixed depth tree. IEEE Int Conf Web Serv 33–40

  39. Zhang SL, Meng WB, Bu JH, Yang S, Liu Y, Pei D, Xu J, Chen Y, Dong H, Qu XP, Song L (2017) Syslog processing for switch failure diagnosis and prediction in datacenter networks. Int Symp Qual Serv 1–10

  40. Messaoudi S, Panichella A, Bianculli D, Briand L, Sasnauskas R (2018) A search-based approach for accurate identification of log message formats. Conf Prog Comprehen 167–177

  41. Zhu J, He S, Liu J, He P, Xie Q, Zheng Z, Lyu MR (2019) Tools and benchmarks for automated log parsing. Int Conf Softw Eng Softw Eng Pract 121–130

  42. Meng W, Liu Y, Zhu Y, Zhang S, Pei D, Liu Y, Chen Y, Zhang R, Tao S, Sun P, Zhou R (2019) Loganomaly: unsupervised detection of sequential and quantitative anomalies in unstructured logs. Int Joint Conf Artif Intell 7:4739–4745

  43. Du M, Li F, Zheng G, Srikumar V (2017) Deeplog: Anomaly detection and diagnosis from system logs through deep learning. ACM SIGSAC Conf Comput Commun Secur 1285–1298

  44. Chen AR (2019) An empirical study on leveraging logs for debugging production failures. Int Conf Softw Eng Companion Proc 126–128

  45. Zhou X, Peng X, Xie T, Sun J, Ji C, Liu D, Xiang Q, He C (2019) Latent error prediction and fault localization for microservice applications by learning from system trace logs. ACM Joint Meeting European Softw Eng Conf Symp Found Softw Eng 683–694

  46. Liang Y, Zhang Y, Xiong H, Sahoo R (2007) Failure prediction in IBM BlueGene/L event logs. IEEE Int Conf Data Min 583–588

  47. Farshchi M, Schneider JG, Weber I, Grundy J (2015) Experience report: Anomaly detection of cloud application operations using log and cloud metric correlation analysis. Int Symp Softw Reliab Eng 24–34

  48. Sipos R, Fradkin D, Moerchen F, Wang Z (2014) Log-based predictive maintenance. ACM SIGKDD Int Conf Knowl Discov Data Min 1867–1876

  49. He S, Lin Q, Lou JG, Zhang H, Lyu MR, Zhang D (2018) Identifying impactful service system problems via log analysis. ACM Joint Meeting European Softw Eng Conf Symp Found Softw Eng 60–70

  50. Chen M, Zheng AX, Lloyd J, Jordan MI, Brewer E (2004) Failure diagnosis using decision trees. Int Conf Auton Comput 36–43

  51. Amar A, Rigby PC (2019) Mining historical test logs to predict bugs and localize faults in the test logs. Int Conf Softw Eng 140–151

  52. Zhao Z, Cerf S, Birke R, Robu B, Bouchenak S, Mokhtar SB, Chen LY (2019) Robust anomaly detection on unreliable data. Annual IEEE/IFIP Int Conf Depend Sys Netw 630–637

  53. Yuan Y, Shi W, Liang B, Qin B (2019) An approach to cloud execution failure diagnosis based on exception logs in OpenStack. Int Conf Cloud Comput 124–131

  54. Zhang K, Xu J, Min MR, Jiang G, Pelechrinis K, Zhang H (2016) Automated IT system failure prediction: a deep learning approach. IEEE Int Conf Big Data 1291–1300

  55. Lou JG, Fu Q, Yang S, Xu Y, Li J (2010) Mining invariants from console logs for system problem detection. USENIX Annual Tech Conf 1–14

  56. Pande A, Ahuja V (2017) WEAC: Word embeddings for anomaly classification from event logs. Int Conf Big Data 1095–1100

  57. Lin Q, Zhang H, Lou JG, Zhang Y, Chen X (2016) Log clustering based problem identification for online service systems. Int Conf Softw Eng Companion 102–111

  58. Brown A, Tuor A, Hutchinson B, Nichols N (2018) Recurrent neural network attention mechanisms for interpretable system log anomaly detection. Workshop Mach Learn Comput Sys 1–8

  59. Zuo Y, Wu Y, Min G, Huang C, Pei K (2020) An intelligent anomaly detection scheme for micro-services architectures with temporal and spatial data analysis. IEEE Trans Cognit Commun Netw

  60. Chen R, Zhang S, Li D, Zhang Y, Guo F, Meng W, Pei D, Zhang Y, Chen X, Liu Y (2020) Logtransfer: cross-system log anomaly detection for software systems with transfer learning. IEEE Int Symp Softw Reliab Eng 30–47

  61. Nedelkoski S, Bogatinovski J, Acker A et al (2020) Self-attentive classification-based anomaly detection in unstructured logs. IEEE Int Conf Data Min 1196–1201

  62. Nedelkoski S, Bogatinovski J, Acker A et al (2020) Self-supervised log parsing. arXiv:2003.07905

  63. Dai H, Li H, Shang W et al (2020) Logram: efficient log parsing using n-gram dictionaries. IEEE Trans Softw Eng PP(99):1–1

  64. Huang C, Wu Y, Zuo Y, Pei K, Min G (2018) Towards experienced anomaly detector through reinforcement learning. Conf Artif Intell 8087–8088

Acknowledgements

This work is supported by the Special Project of the Ministry of Science and Technology of China on Innovation Method (Grant No. 2019IM020100), the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDC02070200), and the STS (Science and Technology Service Network) Plan of the Chinese Academy of Sciences (Grant No. KFJ-STS-QYZD-2021-11-001).

Author information

Corresponding author

Correspondence to Jiang Zhaoxue.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhaoxue, J., Tong, L., Zhenguo, Z. et al. A Survey On Log Research Of AIOps: Methods and Trends. Mobile Netw Appl 26, 2353–2364 (2021). https://doi.org/10.1007/s11036-021-01832-3

  • DOI: https://doi.org/10.1007/s11036-021-01832-3
