Abstract
With the growing demand for computility, the reliability of computility services has become increasingly crucial. As the volume and complexity of processed tasks escalate, these services often operate under high load, which can lead to resource shortages and service interruptions. Logs record the operational information of each component in detail, so log-based anomaly detection can help keep computility services running stably. This study addresses two challenges in log anomaly detection. First, class imbalance in log data has previously been overlooked. Second, given the massive volume of log data, model training time is a significant obstacle. To address these issues, we propose EDSLog, an efficient log anomaly detection framework based on dataset partitioning. First, EDSLog processes log sequences with the Weight-Based K-fold Sub Hold-out Method (WKHM), which alleviates the class-imbalance problem. Next, it uses Simple Recurrent Units (SRU) enhanced by a self-attention mechanism to extract features from log sequences. Finally, it determines whether the predicted log data are anomalous. Experiments show that EDSLog achieves the best evaluation metrics on class-imbalanced datasets while having the shortest total model runtime. Specifically, EDSLog achieved the highest F1 scores, 100 and 99.96, on the BGL and HDFS datasets respectively, where abnormal logs account for 0.1% of the data. In addition, EDSLog trained 35.62% faster than the model with the second-shortest training duration among all compared models.
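To illustrate why the abstract reports F1 scores, not accuracy, for data with 0.1% abnormal logs, the following minimal sketch compares the two metrics on synthetic labels. It is not part of EDSLog: the function name `accuracy_and_f1`, the label vectors, and the 10,000-sequence sample size are assumptions chosen only for illustration.

```python
def accuracy_and_f1(y_true, y_pred):
    """Return (accuracy, F1) for binary labels, where 1 marks an anomaly."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, f1


# Hypothetical data: 10,000 log sequences, 0.1% (10) of them anomalous.
y_true = [1] * 10 + [0] * 9990

# A trivial detector that labels every sequence as normal.
all_normal = [0] * 10000
acc_trivial, f1_trivial = accuracy_and_f1(y_true, all_normal)

# A detector that finds 8 of the 10 anomalies and raises 2 false alarms.
partial = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 9988
acc_partial, f1_partial = accuracy_and_f1(y_true, partial)

print(f"all-normal detector: accuracy={acc_trivial:.4f}, F1={f1_trivial:.4f}")
print(f"partial detector:    accuracy={acc_partial:.4f}, F1={f1_partial:.4f}")
```

Under these assumptions, the all-normal detector reaches an accuracy of 0.999 while its F1 is 0, which is why F1 is the more informative metric at this imbalance ratio.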
Acknowledgments
This work was supported in part by the Natural Science Foundation of Inner Mongolia of China (No. 2023ZD18), the Natural Science Foundation of China (No. 62462047), the Engineering Research Center of Ecological Big Data, Ministry of Education, the fund of Supporting the Reform and Development of Local Universities (Disciplinary Construction) and the special research project of First-class Discipline of Inner Mongolia A. R. of China under Grant YLXKZX-ND-036.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Liang, F., Liu, J. (2025). EDSLog: Efficient Log Anomaly Detection Method Based on Dataset Partitioning. In: Bourke, T., Chen, L., Goharshady, A. (eds) Dependable Software Engineering. Theories, Tools, and Applications. SETTA 2024. Lecture Notes in Computer Science, vol 15469. Springer, Singapore. https://doi.org/10.1007/978-981-96-0602-3_22
DOI: https://doi.org/10.1007/978-981-96-0602-3_22
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-96-0601-6
Online ISBN: 978-981-96-0602-3
eBook Packages: Computer Science, Computer Science (R0)