DOI: 10.1145/3578357.3589457
research-article

Light-Weight Synthesis of Security Logs for Evaluation of Anomaly Detection and Security Related Experiments

Published: 08 May 2023

ABSTRACT

Recent decades have seen the development of a plethora of approaches that use artificial intelligence to detect anomalies and potential signs of compromise in computer networks. These approaches have commonly been trained and evaluated on only a small number of datasets, which have often been criticised in the literature. Developing new datasets for this purpose tends to be very resource-intensive, as they usually rely on testbeds and network emulation. While this level of detail is important for anomaly detection over network traffic, which inspects the contents of network packets, it is superfluous when such algorithms work with logs of security controls, as in SIEM systems and alert correlation approaches. Moreover, an evaluation over a testbed-generated dataset may not be relevant for the target IT system. In this paper, we propose a light-weight method to enrich existing security control logs with carefully crafted synthetic records of the kind that would be produced in the event of a cyber attack. This method does not require running a dedicated testbed or comparable specialized equipment. We prepare a set of attack records with an emphasis on network scans, and perform experiments with real-world firewall logs and several common anomaly detection algorithms to demonstrate that the injected records are appropriately integrated into the original logs. Finally, we propose future experiments to properly validate the quality of the datasets produced using the proposed method.
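
The core idea of enriching an existing log with synthetic attack records can be illustrated with a minimal sketch. This is not the authors' implementation: the record fields (`ts`, `src`, `dst`, `dport`, `action`), the `synthetic` ground-truth flag, and the helper names `make_scan_records` and `inject` are all hypothetical, and a real firewall log would need parsing into such records first. The sketch assumes both inputs are already sorted by timestamp and merges them so that the injected port-scan records appear in chronological order among the original ones.

```python
import heapq
from datetime import datetime, timedelta


def make_scan_records(start, src_ip, dst_ip, ports, gap_ms=10):
    """Build one hypothetical 'deny' record per probed port (assumed schema)."""
    return [
        {
            "ts": start + timedelta(milliseconds=i * gap_ms),
            "src": src_ip,
            "dst": dst_ip,
            "dport": port,
            "action": "deny",
            "synthetic": True,  # ground-truth label for later evaluation
        }
        for i, port in enumerate(ports)
    ]


def inject(original, synthetic):
    """Merge two timestamp-sorted record lists into one sorted list."""
    # Label the original records so that every merged record carries the flag.
    tagged = ({**record, "synthetic": False} for record in original)
    return list(heapq.merge(tagged, synthetic, key=lambda r: r["ts"]))


base = datetime(2023, 1, 1, 12, 0, 0)
original_log = [
    {"ts": base, "src": "10.0.0.5", "dst": "10.0.0.1",
     "dport": 443, "action": "allow"},
    {"ts": base + timedelta(seconds=1), "src": "10.0.0.6", "dst": "10.0.0.1",
     "dport": 53, "action": "allow"},
]
scan = make_scan_records(
    base + timedelta(milliseconds=500), "203.0.113.7", "10.0.0.1", [22, 80, 443]
)
enriched = inject(original_log, scan)
```

Keeping a per-record flag is one way to retain ground truth, so that an anomaly detector's output on `enriched` could later be scored against the injected records.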


Published in

EUROSEC '23: Proceedings of the 16th European Workshop on System Security
May 2023
56 pages
ISBN: 9798400700859
DOI: 10.1145/3578357

Copyright © 2023 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery
New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 47 of 113 submissions, 42%