Deceiving Portable Executable Malware Classifiers into Targeted Misclassification with Practical Adversarial Examples

Author: Guanhua Yan

CODASPY '20: Proceedings of the Tenth ACM Conference on Data and Application Security and Privacy

Pages 341 - 352

https://doi.org/10.1145/3374664.3375741

Published: 16 March 2020

Abstract

Due to the large volume of malware attacks in cyberspace, machine learning has become popular for automating malware detection and classification. In this work we play devil's advocate by investigating a new type of threat that aims to deceive multi-class Portable Executable (PE) malware classifiers into targeted misclassification with practical adversarial samples. Using a malware dataset with tens of thousands of samples, we construct three types of PE malware classifiers: the first is based on the frequencies of opcodes in the disassembled malware code (opcode classifier), the second on the list of API functions imported by each PE sample (API classifier), and the third on the list of system calls observed in dynamic execution (system call classifier). We develop a genetic algorithm augmented with different support functions to deceive these classifiers into misclassifying a PE sample into any target family. Using an Rbot malware sample whose source code is publicly available, we are able to create practical adversarial samples that deceive the opcode classifier into targeted misclassification with a success rate of 75%, the API classifier with a success rate of 83.3%, and the system call classifier with a success rate of 91.7%.
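The abstract names the approach (a genetic algorithm that steers a sample's features toward a target family) but this page does not include the algorithm itself. As a rough illustration of the general idea only, the Python sketch below evolves an opcode count vector until a toy classifier assigns it to a chosen family. Everything in it is an assumption made for illustration and is not taken from the paper: the family names, the centroid values, the nearest-centroid rule based on Jensen-Shannon divergence, the add-only constraint (counts may grow but never shrink, standing in for inserting instructions without removing existing ones), and all genetic algorithm parameters.

```python
import math
import random

# Hypothetical per-family opcode frequency centroids over five opcodes
# (mov, push, call, xor, jmp). Each row sums to 1. Values are made up.
CENTROIDS = {
    "family_a": [0.50, 0.20, 0.15, 0.05, 0.10],
    "family_b": [0.20, 0.15, 0.10, 0.40, 0.15],
    "family_c": [0.25, 0.40, 0.20, 0.05, 0.10],
}


def normalize(counts):
    """Turn raw opcode counts into a frequency distribution."""
    total = sum(counts)
    return [c / total for c in counts]


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def classify(counts):
    """Toy classifier: pick the family whose centroid is closest."""
    freqs = normalize(counts)
    return min(CENTROIDS, key=lambda fam: js_divergence(freqs, CENTROIDS[fam]))


def fitness(added, base, target):
    """Higher is better: negative divergence from the target centroid."""
    counts = [b + a for b, a in zip(base, added)]
    return -js_divergence(normalize(counts), CENTROIDS[target])


def evolve(base, target, rng, pop_size=30, generations=200, max_add=1000):
    """Search for non-negative per-opcode additions that favor `target`."""
    population = [[rng.randint(0, max_add) for _ in base] for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the fitter half unchanged (elitism), then refill with children.
        population.sort(key=lambda ind: fitness(ind, base, target), reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            mom, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, len(base))      # one-point crossover
            child = mom[:cut] + dad[cut:]
            i = rng.randrange(len(child))          # mutate a single gene
            child[i] = min(max_add, max(0, child[i] + rng.randint(-50, 50)))
            children.append(child)
        population = parents + children
    best = max(population, key=lambda ind: fitness(ind, base, target))
    return [b + a for b, a in zip(base, best)]


base = [500, 200, 150, 50, 100]   # counts that match family_a's centroid
adversarial = evolve(base, "family_b", random.Random(0))
print(classify(base), "->", classify(adversarial))
```

The sketch only manipulates a feature vector. Producing a working PE file whose disassembly actually yields those counts is the "practical" part the abstract refers to, and it is outside what this toy example covers.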

Index Terms

Deceiving Portable Executable Malware Classifiers into Targeted Misclassification with Practical Adversarial Examples
1. Security and privacy
  1. Intrusion/anomaly detection and malware mitigation
    1. Malware and its mitigation
2. Theory of computation
  1. Theory and algorithms for application domains
    1. Machine learning theory
      1. Reinforcement learning
        Adversarial learning

Published In

CODASPY '20: Proceedings of the Tenth ACM Conference on Data and Application Security and Privacy

March 2020, 392 pages

ISBN: 9781450371070

DOI: 10.1145/3374664

General Chairs: Vassil Roussev (University of New Orleans, USA), Bhavani Thuraisingham (University of Texas at Dallas, USA)

Program Chairs: Barbara Carminati (University of Insubria, Italy), Murat Kantarcioglu (University of Texas at Dallas, USA)

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGSAC: ACM Special Interest Group on Security, Audit, and Control

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

National Science Foundation

Conference

CODASPY '20

Sponsor:

SIGSAC

CODASPY '20: Tenth ACM Conference on Data and Application Security and Privacy

March 16 - 18, 2020

New Orleans, LA, USA

Acceptance Rates

Overall Acceptance Rate 149 of 789 submissions, 19%
