Abstract
Logs are used in modern software solutions for development and maintenance because they are a rich source of information for subsequent analysis. One line of research applies artificial intelligence techniques to logs to predict system behavior and to detect anomalies. Successful industrial applications are rather sparse due to the lack of publicly available log datasets. To fill this gap, we developed a method to synthetically generate a log dataset that resembles a linear program execution log file. This paper describes the method and discusses existing datasets. The generated dataset is intended to give researchers a common base for new approaches.
The research reported in this paper has been funded by the Federal Ministry for Climate Action, Environment, Energy, Mobility, Innovation and Technology (BMK), the Federal Ministry for Digital and Economic Affairs (BMDW), and the Province of Upper Austria within the framework of the COMET (Competence Centers for Excellent Technologies) Programme, managed by the Austrian Research Promotion Agency (FFG).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Luftensteiner, S., Praher, P. (2022). A Synthetic Dataset for Anomaly Detection of Machine Behavior. In: Kotsis, G., et al. Database and Expert Systems Applications - DEXA 2022 Workshops. DEXA 2022. Communications in Computer and Information Science, vol 1633. Springer, Cham. https://doi.org/10.1007/978-3-031-14343-4_40
DOI: https://doi.org/10.1007/978-3-031-14343-4_40
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-14342-7
Online ISBN: 978-3-031-14343-4
eBook Packages: Computer Science, Computer Science (R0)