DOI: 10.1145/3230833.3230855
research-article

Converting Unstructured System Logs into Structured Event List for Anomaly Detection

Published: 27 August 2018

Abstract

System logs are an invaluable resource for understanding system behavior and detecting anomalies on high performance computing (HPC) systems. As HPC systems continue to grow in both scale and complexity, the sheer volume of system logs and the complex interactions among system components make traditional manual problem diagnosis, and even automated line-by-line log analysis, infeasible or ineffective. In this paper, we present a System Log Event Block Detection (SLEBD) framework that identifies groups of log messages that follow a certain sequence, with variations, and explores these event blocks for event-based system behavior analysis and anomaly detection. Compared with existing approaches that analyze system logs line by line, SLEBD can characterize system behavior and identify intricate anomalies at a higher (i.e., event) level. We evaluate the performance of SLEBD using syslogs collected from production supercomputers. Experimental results show that our framework and mechanisms can process streaming log messages, efficiently extract event blocks, and effectively detect anomalies, which enables system administrators and monitoring tools to understand and process system events in real time. Additionally, we use the identified event blocks and explore deep learning algorithms to model and classify event sequences.
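
To make the contrast between line-by-line and event-level analysis concrete, the following is a minimal sketch and not the SLEBD algorithm itself, whose details the abstract does not give. It assumes that messages can be reduced to templates by masking numbers and hex values, that a time gap separates one event block from the next, and that a block is suspicious when its template sequence is rare. The names `to_template`, `group_into_blocks`, `rare_blocks`, and the parameters `max_gap` and `min_count` are hypothetical, and exact sequence matching stands in for the paper's handling of variations.

```python
import re
from collections import Counter


def to_template(message):
    """Mask variable fields so similar messages share one template (assumed scheme)."""
    message = re.sub(r"0x[0-9a-fA-F]+", "<HEX>", message)
    return re.sub(r"\d+", "<NUM>", message)


def group_into_blocks(records, max_gap=2.0):
    """Split (timestamp, message) records into blocks of templates.

    A new block starts whenever the gap to the previous message exceeds
    max_gap seconds. This is a simplifying assumption for illustration.
    """
    blocks, current, last_ts = [], [], None
    for ts, msg in records:
        if last_ts is not None and ts - last_ts > max_gap:
            blocks.append(tuple(current))
            current = []
        current.append(to_template(msg))
        last_ts = ts
    if current:
        blocks.append(tuple(current))
    return blocks


def rare_blocks(blocks, min_count=2):
    """Return blocks whose exact template sequence occurs fewer than min_count times."""
    counts = Counter(blocks)
    return [b for b in blocks if counts[b] < min_count]


# Made-up log records for illustration only.
records = [
    (0.0, "node 12 link down"),
    (0.5, "node 12 rerouting via 0x1f"),
    (1.0, "node 12 link up"),
    (10.0, "node 7 link down"),
    (10.4, "node 7 rerouting via 0x2a"),
    (10.9, "node 7 link up"),
    (20.0, "node 3 link down"),
    (20.3, "node 3 kernel panic"),
]
blocks = group_into_blocks(records)
suspicious = rare_blocks(blocks)
```

In this toy data the first two blocks reduce to the same template sequence, while the third ("link down" followed by "kernel panic") occurs only once and is flagged. A line-by-line analysis would see "link down" as a familiar message in all three cases, which is the kind of limitation that event-level analysis is meant to address.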

Cited By

  • (2022) Landscape of Automated Log Analysis: A Systematic Literature Review and Mapping Study. IEEE Access, 10, pp. 21892-21913. DOI: 10.1109/ACCESS.2022.3152549. Online publication date: 2022
  • (2021) Survey on Log Clustering Approaches. Smart Log Data Analytics, pp. 13-41. DOI: 10.1007/978-3-030-74450-2_2. Online publication date: 29-Aug-2021
  • (2020) Boot Log Anomaly Detection with K-Seen-Before. 2020 IEEE 44th Annual Computers, Software, and Applications Conference (COMPSAC), pp. 1005-1010. DOI: 10.1109/COMPSAC48688.2020.0-140. Online publication date: Jul-2020

Published In

ARES '18: Proceedings of the 13th International Conference on Availability, Reliability and Security
August 2018
603 pages
ISBN:9781450364485
DOI:10.1145/3230833
© 2018 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

In-Cooperation

  • Universität Hamburg

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. HPC systems
  2. anomaly detection
  3. behavior analysis
  4. system reliability

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ARES 2018

Acceptance Rates

ARES '18 Paper Acceptance Rate 128 of 260 submissions, 49%;
Overall Acceptance Rate 228 of 451 submissions, 51%
