LogGAN: a Log-level Generative Adversarial Network for Anomaly Detection using Permutation Event Modeling

Information Systems Frontiers

Abstract

System logs, which trace system states and record valuable events, are a significant component of any computer system. Logs contain information about both normal and abnormal instances that assists administrators in diagnosing and maintaining the operation of systems. If administrators cannot detect and eliminate diverse and complex anomalies (e.g., bugs and failures) efficiently, running workflows and transactions, and even entire systems, may break down. Anomaly detection has therefore become increasingly important and has attracted considerable research attention. However, current approaches concentrate on detecting anomalies at a coarse granularity of logs (e.g., the session) rather than at the level of individual logs, which reduces the efficiency of responding to anomalies and of diagnosing system failures. To overcome this limitation, we propose LogGAN, an LSTM-based generative adversarial network for anomaly detection on system logs using permutation event modeling, which detects log-level anomalies based on patterns (i.e., combinations of the latest logs). On the one hand, permutation event modeling mitigates the strong sequential characteristics of LSTM, addressing the out-of-order problem caused by the arrival delays of logs. On the other hand, the generative adversarial network-based model mitigates the impact of the imbalance between normal and abnormal instances, improving anomaly detection performance. To evaluate LogGAN, we conduct extensive experiments on two real-world datasets, and the experimental results show the effectiveness of the proposed approach on the task of log-level anomaly detection.
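The out-of-order problem mentioned in the abstract can be illustrated with a small sketch. This is not the authors' procedure, which is described in the full text rather than here; it only shows, under our own assumptions, why a window of recent logs treated as an ordered sequence changes when one log arrives late, while the same window treated as a combination does not. The event IDs and both helper functions are hypothetical.

```python
from itertools import permutations


def window_permutations(events):
    """Return every distinct ordering of a window of event IDs."""
    return sorted(set(permutations(events)))


def combination_key(events):
    """Order-insensitive key for a window: the sorted tuple of its event IDs."""
    return tuple(sorted(events))


# Hypothetical event IDs; "delayed" holds the same logs, but E5 arrived late.
in_order = ["E5", "E22", "E11"]
delayed = ["E22", "E5", "E11"]

# As ordered sequences the two windows differ...
print(in_order == delayed)
# ...but as combinations they are the same pattern.
print(combination_key(in_order) == combination_key(delayed))
# The delayed window is one of the orderings of the original window.
print(tuple(delayed) in window_permutations(in_order))
```

A strictly sequential model sees `in_order` and `delayed` as different inputs, whereas any order-insensitive representation maps them to the same pattern.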

Notes

  1. A learned event embedding is used to represent each event.

  2. In D, the combination c_i is the input of the LSTM, and the LSTM directly outputs its hidden layer without any manipulation. We then concatenate the m-dimensional vector with the hidden layer as the input of a two-layer fully connected neural network, which performs binary classification of whether the m-dimensional vector is real or fake.

  3. The case is selected from the training set and slightly adapted so that it illustrates all types of processing in permutation event modeling.

  4. https://github.com/logpai/loghub
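The discriminator described in note 2 can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the class name, the layer sizes, the random initialisation, the ReLU and sigmoid activations, and the use of the final hidden state as "the hidden layer" are all assumptions made for the example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def linear(weights, bias, vec):
    """Dense layer: `weights` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def init(rows, cols, rng):
    """Small random weight matrix (assumed initialisation)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class Discriminator:
    """Hypothetical sketch: LSTM over c_i, then a two-layer dense classifier."""

    def __init__(self, embed_dim, hidden_dim, m, fc_dim, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        # One (weights, bias) pair per LSTM gate: input, forget, candidate,
        # output. Each gate acts on the concatenation [x_t, h_{t-1}].
        self.gates = {
            g: (init(hidden_dim, embed_dim + hidden_dim, rng), [0.0] * hidden_dim)
            for g in "ifgo"
        }
        # Two fully connected layers over [hidden state, m-dimensional vector].
        self.fc1 = (init(fc_dim, hidden_dim + m, rng), [0.0] * fc_dim)
        self.fc2 = (init(1, fc_dim, rng), [0.0])

    def lstm_last_hidden(self, combination):
        """Run the LSTM over the event embeddings; return the last hidden state."""
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in combination:
            z = list(x) + h
            i = [sigmoid(v) for v in linear(*self.gates["i"], z)]
            f = [sigmoid(v) for v in linear(*self.gates["f"], z)]
            g = [math.tanh(v) for v in linear(*self.gates["g"], z)]
            o = [sigmoid(v) for v in linear(*self.gates["o"], z)]
            c = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
            h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
        return h

    def forward(self, combination, candidate):
        """Return the estimated probability that `candidate` is real."""
        h = self.lstm_last_hidden(combination)
        # Concatenate the unmodified hidden state with the m-dimensional vector.
        hidden = [max(0.0, v) for v in linear(*self.fc1, h + list(candidate))]
        return sigmoid(linear(*self.fc2, hidden)[0])


# Usage with made-up embeddings for a combination of three events (m = 5).
d = Discriminator(embed_dim=4, hidden_dim=8, m=5, fc_dim=6)
combination = [[0.1, 0.2, 0.3, 0.4], [0.0, -0.1, 0.2, 0.1], [0.3, 0.3, -0.2, 0.0]]
candidate = [0.0, 1.0, 0.0, 0.0, 0.0]
score = d.forward(combination, candidate)
print(0.0 < score < 1.0)
```

The weights here are untrained, so the score carries no meaning; the sketch only shows how the combination and the m-dimensional vector flow through the network to a single real-or-fake output.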

Acknowledgment

This work was supported by the National Natural Science Foundation of China under Grant Nos. 61802205, 61872186, and 61772284, the Natural Science Research Project of Jiangsu Province under Grant 18KJB520037, and the research funds of NJUPT under Grant NY218116.

Author information

Corresponding author

Correspondence to Bin Xia.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Xia, B., Bai, Y., Yin, J. et al. LogGAN: a Log-level Generative Adversarial Network for Anomaly Detection using Permutation Event Modeling. Inf Syst Front 23, 285–298 (2021). https://doi.org/10.1007/s10796-020-10026-3

  • DOI: https://doi.org/10.1007/s10796-020-10026-3
