DOI: 10.1145/3607505.3607510
research-article

Towards Reproducible Ransomware Analysis

Published: 21 August 2023

ABSTRACT

Ransomware attacks continue to be a prominent cybersecurity threat and the subject of considerable research activity. Despite frequent high-profile public reports of ransomware attacks, we found a paucity of tangible, open behavioral activity data for large collections of real-world ransomware binaries. The lack of such open datasets creates barriers to research that might otherwise lead to innovative approaches to ransomware mitigation. We have constructed a dataset of ransomware activity logs and corresponding provenance graphs, derived from the sandboxed execution of all ransomware-tagged binaries in the widely known MalwareBazaar. We also provide the code for orchestrating the log collection and provenance inference steps, so that other researchers can customize and extend it for their own analyses. We hope that the dataset will facilitate the discovery of innovative and effective ransomware mitigation strategies.

References

  1. [n. d.]. Cuckoo Sandbox. https://cuckoosandbox.org/
  2. [n. d.]. FBI No Longer Negotiating with Ransomware Group That Leaked Oakland Data. https://abc7news.com/oakland-ransomware-hacked-data-leaked-fbi-dark-web/13225220/
  3. [n. d.]. MalwareBazaar. https://bazaar.abuse.ch/browse/
  4. [n. d.]. Ransomware Full Recovery Could Take Months, Dallas Officials Say. https://www.dallasnews.com/news/politics/2023/05/11/ransomware-full-recovery-could-take-months-dallas-officials-say/
  5. [n. d.]. Tukey five-number summary. https://en.wikipedia.org/wiki/Five-number_summary
  6. Muhammad Ejaz Ahmed, Hyoungshick Kim, Seyit Camtepe, and Surya Nepal. 2021. Peeler: Profiling Kernel-Level Events to Detect Ransomware. 26th European Symposium on Research in Computer Security (2021).
  7. Mathieu Barre, Ashish Gehani, and Vinod Yegneswaran. 2019. Mining Data Provenance to Detect Advanced Persistent Threats. 11th USENIX Workshop on the Theory and Practice of Provenance (TaPP) (2019).
  8. Gordon Blair. 2022. Test of Time Award. ACM Middleware (2022). https://middleware-conf.github.io/2022/awards/#testOfTime
  9. Simon Davies, Richard Macfarlane, and William J Buchanan. 2022. NapierOne: A Modern Mixed File Data Set Alternative to Govdocs1. Forensic Science International: Digital Investigation 40 (2022).
  10. Feng Dong, Liu Wang, Xu Nie, Fei Shao, Haoyu Wang, Ding Li, Xiapu Luo, and Xusheng Xiao. 2023. DISTDET: A Cost-Effective Distributed Cyber Threat Detection System. 30th USENIX Security Symposium (2023).
  11. John Ellson, Emden Gansner, Lefteris Koutsofios, Stephen C North, and Gordon Woodhull. 2002. Graphviz — Open Source Graph Drawing Tools. 9th International Symposium on Graph Drawing (2002).
  12. Ashish Gehani, Raza Ahmad, Hassaan Irshad, Jianqiao Zhu, and Jignesh Patel. 2021. Digging Into "Big Provenance" (With SPADE). Commun. ACM 64(12) (2021).
  13. Ashish Gehani and Dawood Tariq. 2012. SPADE: Support for Provenance Auditing in Distributed Environments. 13th ACM/IFIP/USENIX International Middleware Conference (2012).
  14. REPROD GitHub. [n. d.]. Code for orchestrating ransomware execution log and provenance collection. https://github.com/REPROD-prov
  15. Xueyuan Han, James Mickens, Ashish Gehani, Margo Seltzer, and Thomas Pasquier. 2020. Xanthus: Push-button Orchestration of Host Provenance Data Collection. 3rd ACM Workshop on Practical Reproducible Evaluation of Computer Systems (P-RECS) (2020).
  16. Manabu Hirano, Ryo Hodota, and Ryotaro Kobayashi. 2022. RanSAP: An Open Dataset of Ransomware Storage Access Patterns for Training Machine Learning Models. Forensic Science International: Digital Investigation 40 (2022).
  17. Hassaan Irshad, Gabriela Ciocarlie, Ashish Gehani, Vinod Yegneswaran, Kyu Hyung Lee, Jignesh Patel, Somesh Jha, Yonghwi Kwon, Dongyan Xu, and Xiangyu Zhang. 2021. TRACE: Enterprise-Wide Provenance Tracking For Real-Time APT Detection. IEEE Transactions on Information Forensics and Security (TIFS) 16 (2021).
  18. Amin Kharaz, Sajjad Arshad, Collin Mulliner, William Robertson, and Engin Kirda. 2016. UNVEIL: A Large-Scale, Automated Approach to Detecting Ransomware. 25th USENIX Security Symposium (2016).
  19. Rui Mei, Han-Bing Yan, and Zhi-Hui Han. 2021. RansomLens: Understanding Ransomware via Causality Analysis on System Provenance Graph. Science of Cyber Security (2021).
  20. Richard Vanderford. 2023. Merck’s Insurers On the Hook in $1.4 Billion NotPetya Attack, Court Says. Wall Street Journal (2023).
  21. Aldin Vehabovic, Nasir Ghani, Elias Bou-Harb, Jorge Crichigno, and Aysegul Yayimli. 2022. Ransomware Detection and Classification Strategies. IEEE International Black Sea Conference on Communications and Networking (2022).
  22. Christian Wojner. [n. d.]. DensityScout. https://cert.at/en/downloads/software/software-densityscout
  23. REPROD Zenodo. [n. d.]. Ransomware execution trace and provenance data. https://doi.org/10.5281/zenodo.7933806

    • Published in
      CSET '23: Proceedings of the 16th Cyber Security Experimentation and Test Workshop
      August 2023
      87 pages
      ISBN:9798400707889
      DOI:10.1145/3607505

      Copyright © 2023 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • research-article
      • Research
      • Refereed limited