MaleficNet: Hiding Malware into Deep Neural Networks Using Spread-Spectrum Channel Coding

  • Conference paper
Computer Security – ESORICS 2022 (ESORICS 2022)

Abstract

Training and developing good deep learning models is often challenging, leading individuals (developers, researchers, and practitioners alike) to use third-party models from public repositories and fine-tune them to their needs, usually with little to no effort. Despite its undeniable benefits, this practice can open new attack vectors. In this paper, we demonstrate the feasibility and effectiveness of one such attack, namely malware embedding in deep learning models. We push the boundaries of the current state of the art by introducing MaleficNet, a technique that combines spread-spectrum channel coding with error-correction techniques to inject malicious payloads into the parameters of deep neural networks, all while causing no degradation to the model's performance and successfully bypassing state-of-the-art detection and removal mechanisms. We believe this work will raise awareness of these new, dangerous, camouflaged threats, assist the research community and practitioners in evaluating the capabilities of modern machine learning architectures, and pave the way for research targeting the detection and mitigation of such threats.

D. Hitaj and G. Pagnotta—Equal Contribution.


Notes

  1. GPT-3, a language model by OpenAI, has 175 billion parameters; Gopher by DeepMind has a total of 280 billion parameters; and GLaM from Google has 1.2 trillion weight parameters.

  2. In our case we selected \(\gamma \) in the range \([1\times 10^{-5}, 9\times 10^{-3}]\) following a grid-search approach.
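The grid search over \(\gamma \) can be sketched as follows. This is a minimal illustration of the selection procedure, not the paper's implementation: the `evaluate` callback, the 1% accuracy tolerance, and the number of candidate steps are our own assumptions.

```python
import numpy as np

def select_gamma(evaluate, low=1e-5, high=9e-3, steps=10):
    """Grid-search the embedding strength gamma over the paper's stated range.

    `evaluate` is a hypothetical callback mapping gamma to a pair
    (task_accuracy, payload_decoded_ok). Returns the largest gamma that keeps
    accuracy within an assumed 1% tolerance while the payload still decodes,
    or None if no candidate qualifies.
    """
    best = None
    for gamma in np.geomspace(low, high, steps):  # log-spaced candidates
        accuracy, payload_ok = evaluate(gamma)
        if payload_ok and accuracy >= 0.99:
            best = gamma  # candidates increase, so this keeps the largest
    return best
```

A larger \(\gamma \) makes the payload easier to decode but perturbs the weights more, so the search balances decodability against task accuracy.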


Acknowledgements

The work of Dorjan Hitaj, Giulio Pagnotta, and Luigi V. Mancini was supported by Gen4olive, a project that has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No. 101000427.

Author information

Corresponding author: Dorjan Hitaj.

Appendices

A Additional Experiments

Figure 3 compares the model parameter distributions with and without embedded malware across different deep neural network architectures.

Fig. 3. Comparison between the weight parameter distributions of different DNNs before and after malware payloads of various sizes were embedded using the MaleficNet technique.

Table 5 reports model performance on the Cats vs. Dogs dataset.

Table 5. Baseline vs. MaleficNet model performance on the Cats vs. Dogs dataset, across different DNN architectures and malware payload sizes.
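A distribution comparison of this kind can be reproduced with a two-sample Kolmogorov-Smirnov statistic. The following numpy-only sketch is our illustration, not part of MaleficNet; it computes the maximum distance between the empirical CDFs of a baseline model's weights and an embedded model's weights.

```python
import numpy as np

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum absolute
    difference between the empirical CDFs of weight samples a and b."""
    a, b = np.sort(np.ravel(a)), np.sort(np.ravel(b))
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))
```

A small statistic for the embedded model, comparable to the run-to-run variation between independently trained baselines, is consistent with the claim that embedding does not visibly perturb the weight distribution.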

B Implementation Details

(Algorithm 1: MaleficNet payload injection. Algorithm 2: MaleficNet payload extraction.)

Algorithms 1 and 2 show the implementation details of MaleficNet's payload injection and extraction methods. The injection module (Algorithm 1) takes as input a model W, divided into k blocks of size s, and uses the CDMA channel-coding technique to inject a pre-selected malware binary into the model weights. To allow quick verification that the malware payload is extracted correctly, the injection module also embeds a 256-bit hash of the malware binary as part of the payload. As mentioned above, to avoid deteriorating the model's performance on the legitimate task, we partition both the network and the payload into chunks and embed one payload chunk into one network chunk. CDMA takes a narrowband signal and spreads it into a wideband signal to allow for reliable transmission and decoding. To satisfy this property, each network chunk is chosen to be several times larger than a payload chunk. In our experiments, the narrowband signal (payload chunk) is spread into a wideband signal (model chunk) that is 6 times larger, i.e., the spreading code of each bit of the payload chunk is 6 times the length of the chunk.

The extraction module (Algorithm 2) likewise takes as input a model W divided into k blocks of size s. To extract the malware payload, the extractor needs the seed used to generate the spreading codes and the LDPC matrices, the hash of the malware binary (to verify whether extraction succeeded), the size of the narrowband signal (d), and the length of the malware payload. The channel noise is estimated from the first 200 extracted bits; the LDPC decoder is then used to recover the payload. The extraction module returns the malware payload and its hash.
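The core CDMA embedding and correlation-based recovery can be sketched as follows. This is a minimal, numpy-only illustration under assumptions of our own: it omits the LDPC error-correction, channel-noise estimation, and hash-verification steps the algorithms describe, and it uses a larger spreading factor than the paper's 6x precisely because, without LDPC, raw sign-based decoding needs more redundancy.

```python
import numpy as np

def inject_chunk(weights, payload_bits, gamma=0.01, seed=7):
    """Spread each payload bit over the whole weight chunk with a
    pseudo-random +/-1 CDMA spreading code, scaled by gamma."""
    rng = np.random.default_rng(seed)
    w = np.asarray(weights, dtype=np.float64).copy()
    symbols = 2.0 * np.asarray(payload_bits) - 1.0    # BPSK map {0,1} -> {-1,+1}
    codes = rng.choice([-1.0, 1.0], size=(symbols.size, w.size))
    w += gamma * (codes.T @ symbols)                  # superimpose all spread bits
    return w

def extract_chunk(weights, n_bits, seed=7):
    """Recover the bits by correlating the chunk with each spreading code
    (regenerated from the shared seed) and taking the sign."""
    rng = np.random.default_rng(seed)
    w = np.asarray(weights, dtype=np.float64)
    codes = rng.choice([-1.0, 1.0], size=(n_bits, w.size))
    return (codes @ w > 0).astype(int)
```

In MaleficNet proper, the soft correlation values feed an LDPC decoder and the decoded payload is checked against the embedded 256-bit hash; here the hard sign decision alone plays that role.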

Rights and permissions

Reprints and permissions

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Hitaj, D., Pagnotta, G., Hitaj, B., Mancini, L.V., Perez-Cruz, F. (2022). MaleficNet: Hiding Malware into Deep Neural Networks Using Spread-Spectrum Channel Coding. In: Atluri, V., Di Pietro, R., Jensen, C.D., Meng, W. (eds) Computer Security – ESORICS 2022. ESORICS 2022. Lecture Notes in Computer Science, vol 13556. Springer, Cham. https://doi.org/10.1007/978-3-031-17143-7_21


  • DOI: https://doi.org/10.1007/978-3-031-17143-7_21


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-17142-0

  • Online ISBN: 978-3-031-17143-7

  • eBook Packages: Computer Science, Computer Science (R0)
