DOI: 10.1145/3512290.3528693

Adapting novelty towards generating antigens for antivirus systems

Published: 08 July 2022

ABSTRACT

Anti-malware scanners are well known to depend on malware signatures to identify malware. However, even a minor modification to the structure of malware code changes its signature, which enables the variant to evade detection by scanners. There is therefore a need for a proactively generated dataset of malware variants to help automated antivirus scanners detect such diverse variants. This paper proposes and demonstrates a generic framework, based on assembly source code, that allows any evolutionary algorithm to generate diverse, viable variants of an input malware that retain its maliciousness while evading antivirus scanners. The framework contributes two components for evolutionary algorithms: generic code transformation functions, used as variation operators, and a quality metric supported by novelty search, used as the fitness function. The results demonstrate the effectiveness of the framework in generating diverse variants, and the generated variants are shown to evade over 98% of popular antivirus scanners. The malware variants evolved by the framework can serve as antigens that help malware analysis engines improve their detection algorithms.
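The abstract does not define the novelty-search-supported quality metric. As a rough illustration of the general idea only, the sketch below computes the standard novelty-search sparseness score: the mean distance from a candidate to its k nearest neighbours in an archive of previously seen individuals. The function name `novelty_score`, the numeric descriptor tuples, the Euclidean distance, and the value of `k` are all assumptions made for this example; they are not taken from the paper.

```python
import math


def novelty_score(candidate, archive, k=3):
    """Mean Euclidean distance from `candidate` to its k nearest archive members.

    `candidate` and every archive entry are equal-length numeric descriptors
    (hypothetical here; the paper's actual descriptors are not given in the abstract).
    """
    if not archive:
        # Nothing to compare against: treat the first individual as maximally novel.
        return float("inf")
    distances = sorted(math.dist(candidate, other) for other in archive)
    nearest = distances[:k]
    return sum(nearest) / len(nearest)


# Toy archive of two-dimensional descriptors (illustrative values only).
archive = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
near = novelty_score((0.1, 0.1), archive, k=2)
far = novelty_score((5.0, 5.0), archive, k=2)
```

A candidate far from everything in the archive (`far`) scores higher than one close to existing entries (`near`), so a fitness function built on such a score would reward diversity rather than a single fixed objective.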

Published in

GECCO '22: Proceedings of the Genetic and Evolutionary Computation Conference
July 2022
1472 pages
ISBN: 9781450392372
DOI: 10.1145/3512290

Copyright © 2022 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Overall acceptance rate: 1,669 of 4,410 submissions (38%)
