Realistic and Configurable Synthesis of Malware Traces in Windows Systems

  • Conference paper
Advances in Digital Forensics XVIII (DigitalForensics 2022)

Part of the book series: IFIP Advances in Information and Communication Technology (IFIPAICT, volume 653)

Abstract

Malware poses a long-term challenge to the operation of contemporary information technology systems. Training digital forensic professionals to use forensic tools, and keeping their skills current, requires a large amount of realistic and up-to-date training data. Unfortunately, very few training images are available, especially images of recent malware, for reasons such as privacy, competitive advantage, intellectual property rights and secrecy. A promising solution is to provide recent, realistic corpora produced by dataset synthesis frameworks. However, none of the publicly available frameworks currently supports the creation of realistic malware traces in a customizable manner, where the synthesis of relevant traces can be configured to meet individual needs.

This chapter presents the concept, implementation and validation of a synthesis framework that generates malware traces for Windows operating systems. The framework generates coherent malware traces at three levels: random-access memory, network and hard drive. A typical malware infection with data exfiltration is demonstrated as a proof of concept.

Author information

Corresponding author

Correspondence to Thomas Göbel.

Copyright information

© 2022 IFIP International Federation for Information Processing

About this paper

Cite this paper

Lukner, M., Göbel, T., Baier, H. (2022). Realistic and Configurable Synthesis of Malware Traces in Windows Systems. In: Peterson, G., Shenoi, S. (eds) Advances in Digital Forensics XVIII. DigitalForensics 2022. IFIP Advances in Information and Communication Technology, vol 653. Springer, Cham. https://doi.org/10.1007/978-3-031-10078-9_2

  • DOI: https://doi.org/10.1007/978-3-031-10078-9_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-10077-2

  • Online ISBN: 978-3-031-10078-9

  • eBook Packages: Computer Science (R0)
