A Synthetic Dataset for Anomaly Detection of Machine Behavior

  • Conference paper
Database and Expert Systems Applications - DEXA 2022 Workshops (DEXA 2022)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1633)

Abstract

Logs are used in modern software solutions for development and maintenance because they are a rich source of information for subsequent analysis. One line of research applies artificial intelligence techniques to logs in order to predict system behavior and to perform anomaly detection. Successful industrial applications are rather rare due to the lack of publicly available log datasets. To fill this gap, we developed a method to synthetically generate a log dataset that resembles the log file of a linear program execution. This paper describes the method and discusses existing datasets. The generated dataset is intended to give researchers a common basis for new approaches.

The research reported in this paper has been funded by the Federal Ministry for Climate Action, Environment, Energy, Mobility, Innovation and Technology (BMK), the Federal Ministry for Digital and Economic Affairs (BMDW), and the Province of Upper Austria in the frame of the COMET - Competence Centers for Excellent Technologies Programme managed by the Austrian Research Promotion Agency (FFG).

Author information

Corresponding author

Correspondence to Sabrina Luftensteiner.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Luftensteiner, S., Praher, P. (2022). A Synthetic Dataset for Anomaly Detection of Machine Behavior. In: Kotsis, G., et al. Database and Expert Systems Applications - DEXA 2022 Workshops. DEXA 2022. Communications in Computer and Information Science, vol 1633. Springer, Cham. https://doi.org/10.1007/978-3-031-14343-4_40

  • DOI: https://doi.org/10.1007/978-3-031-14343-4_40

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-14342-7

  • Online ISBN: 978-3-031-14343-4

  • eBook Packages: Computer Science, Computer Science (R0)
