DOI: 10.1145/3607947.3608057

Mitigating Adversarial Attacks using Pruning

Published: 28 September 2023

Abstract

The advent of deep learning has revolutionized the technology industry and made Deep Neural Networks (DNNs) the powerhouse of many modern software applications. Well-trained DNNs perform complex tasks such as speech recognition, object detection, and image classification with high precision and accuracy. However, training such complex networks often requires enormous computational resources, so the task is frequently outsourced to third parties. Recent work suggests that outsourced training gives a malicious trainer an opportunity to implant a backdoor in the model which, when triggered, forces the model to behave in a way predefined by the attacker. This paper first gives an overview of how such attacks are mounted and then discusses, with experimental evidence, strategies that can be used to neutralise them. We use the l1 and l2 norms to identify weights that are susceptible to poisoning and prune them away by setting their values to zero, and we compare the effectiveness of layer-wise and global pruning. Our experiments show that fine-tuning the model for a few epochs after the fine-pruning stage helps it regain lost accuracy and yields better test-time accuracy, and that fine-pruning the later layers is more effective. By pruning the last layer and fine-tuning the model afterwards, we achieve 99.96% and 86.05% accuracy on the clean validation and test datasets respectively, while the attack success rate drops from 99% to 0%, or near 0% in some cases.
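
To make the pruning and fine-tuning procedure described in the abstract concrete, the following is a minimal, illustrative sketch (not the authors' implementation) of l1-norm based pruning of a network's last fully connected layer, followed by a brief fine-tuning pass on clean data. The PyTorch model, the clean data loader, the pruning fraction, and the rule of zeroing the smallest-norm output neurons are all assumptions made for illustration.

    # Illustrative sketch only: l1-norm pruning of the final linear layer,
    # then short fine-tuning on clean data to recover lost accuracy.
    # `model`, `clean_loader`, `fraction`, and the smallest-norm selection
    # rule are assumptions, not taken from the paper.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def prune_last_layer_l1(model: nn.Module, fraction: float = 0.2) -> None:
        """Zero the weights (and biases) of the `fraction` of output neurons
        in the last nn.Linear layer with the smallest l1 weight norms."""
        last = [m for m in model.modules() if isinstance(m, nn.Linear)][-1]
        with torch.no_grad():
            norms = last.weight.abs().sum(dim=1)   # l1 norm per output neuron
            k = int(fraction * norms.numel())
            if k == 0:
                return
            idx = torch.argsort(norms)[:k]         # neurons with smallest norms
            last.weight[idx] = 0.0
            if last.bias is not None:
                last.bias[idx] = 0.0

    def fine_tune(model, clean_loader, epochs=2, lr=1e-3, device="cpu"):
        """Fine-tune for a few epochs on clean data after pruning."""
        model.to(device).train()
        opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
        for _ in range(epochs):
            for x, y in clean_loader:
                x, y = x.to(device), y.to(device)
                opt.zero_grad()
                F.cross_entropy(model(x), y).backward()
                opt.step()

    # Usage (hypothetical model and data loader):
    # prune_last_layer_l1(model, fraction=0.2)
    # fine_tune(model, clean_loader, epochs=2)

A global pruning variant would rank weights across all layers rather than within a single layer before zeroing them.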


    Published In

    IC3-2023: Proceedings of the 2023 Fifteenth International Conference on Contemporary Computing
    August 2023
    783 pages
    ISBN: 9798400700224
    DOI: 10.1145/3607947

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Adversarial attack
    2. network pruning
    3. neural networks

