Abstract
An adversary who aims to steal a black-box model repeatedly queries it via a prediction API to learn its decision boundary. Adversarial approximation is non-trivial because of the enormous space of candidate model architectures, parameters, and features to explore. In this context, the adversary resorts to a best-effort strategy that yields the closest possible approximation. This paper explores best-effort adversarial approximation of a black-box malware classifier in the most challenging setting, where the adversary’s knowledge is limited to the label the classifier returns for a given input. Beginning with a limited input set, we leverage feature representation mapping and cross-domain transferability to locally approximate a black-box malware classifier. We do so with different feature types for the target and the substitute model, while also using non-overlapping data for training the target, training the substitute, and comparing the two. Against a Convolutional Neural Network (CNN) trained on raw byte sequences of Windows Portable Executables (PEs), our approach achieves a 92% accurate substitute (trained on pixel representations of PEs), and nearly 90% prediction agreement between the target and the substitute model. Against a 97.8% accurate gradient boosted decision tree trained on static PE features, our 91% accurate substitute agrees with the black-box on 90% of predictions, suggesting the strength of our purely black-box approximation.
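To make the feature representation mapping step concrete, the sketch below converts a PE's raw bytes into a grayscale image, the kind of pixel representation on which the substitute is trained. This is a minimal illustration of one common byte-to-pixel encoding (fixed-width rows of bytes rendered as grayscale pixels), not the paper's exact pipeline; the row width and the square resize target are illustrative assumptions.

```python
# Minimal sketch (assumptions noted): render a PE's raw bytes as a grayscale
# image so an image-based substitute model can consume it. The fixed row
# width of 256 and the 256x256 resize target are illustrative choices, not
# parameters taken from the paper.
import numpy as np
from PIL import Image

def pe_to_grayscale(path: str, out_size: int = 256) -> Image.Image:
    data = np.fromfile(path, dtype=np.uint8)  # one byte -> one pixel intensity
    width = 256                               # assumed fixed row width
    height = len(data) // width
    if height == 0:
        raise ValueError("file smaller than one image row")
    # Truncate the trailing partial row, reshape into a 2D byte grid,
    # and interpret it as an 8-bit grayscale ("L" mode) image.
    img = Image.fromarray(data[: height * width].reshape(height, width), mode="L")
    return img.resize((out_size, out_size))   # square input for a CNN substitute

# Usage: img = pe_to_grayscale("sample.exe"); img.save("sample.png")
```

A fixed square output keeps every PE, regardless of size, compatible with a single CNN input shape, at the cost of distorting byte locality for very large or very small binaries.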
Copyright information
© 2020 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Ali, A., Eshete, B. (2020). Best-Effort Adversarial Approximation of Black-Box Malware Classifiers. In: Park, N., Sun, K., Foresti, S., Butler, K., Saxena, N. (eds) Security and Privacy in Communication Networks. SecureComm 2020. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 335. Springer, Cham. https://doi.org/10.1007/978-3-030-63086-7_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63085-0
Online ISBN: 978-3-030-63086-7
eBook Packages: Computer Science (R0)