DOI: 10.1145/3607947.3608057

Mitigating Adversarial Attacks using Pruning

Published: 28 September 2023

Abstract

The advent of deep learning has revolutionized the technology industry and made Deep Neural Networks (DNNs) the powerhouse of many modern software applications. Well-trained DNNs perform complex tasks such as speech recognition, object detection, and image classification with high precision and accuracy. However, training such complex networks often requires enormous computational resources, so the task is frequently outsourced to third parties. Recent work suggests that outsourced training gives a malicious trainer an opportunity to implant a backdoor in the model which, when triggered, forces the model to behave in a way predefined by the attacker. This paper first gives an overview of how such attacks are mounted and then discusses, with experimental evidence, strategies that can be used to neutralise them. We use the l1 and l2 norms to identify weights that are susceptible to poisoning and prune them away by setting their values to zero, and we compare the effectiveness of layer-wise and global pruning. Our experiments show that fine-tuning the model for a few epochs after the fine-pruning stage helps it regain lost accuracy and yields better test-time accuracy, and that fine-pruning the later layers is more effective. By pruning the last layer and fine-tuning the model afterwards, we achieve 99.96% and 86.05% accuracy on the clean validation and test datasets respectively, while the attack success rate drops from 99% to 0%, or near 0% in some cases.
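
To make the pruning and fine-tuning procedure described in the abstract concrete, the following is a minimal, illustrative sketch (not the authors' implementation) of l1-norm based pruning of a network's last fully connected layer, followed by a brief fine-tuning pass on clean data. The PyTorch model, the clean data loader, the pruning fraction, and the rule of zeroing the smallest-norm output neurons are all assumptions made for illustration.

    # Illustrative sketch only: l1-norm pruning of the final linear layer,
    # then short fine-tuning on clean data to recover lost accuracy.
    # `model`, `clean_loader`, `fraction`, and the smallest-norm selection
    # rule are assumptions, not taken from the paper.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def prune_last_layer_l1(model: nn.Module, fraction: float = 0.2) -> None:
        """Zero the weights (and biases) of the `fraction` of output neurons
        in the last nn.Linear layer with the smallest l1 weight norms."""
        last = [m for m in model.modules() if isinstance(m, nn.Linear)][-1]
        with torch.no_grad():
            norms = last.weight.abs().sum(dim=1)   # l1 norm per output neuron
            k = int(fraction * norms.numel())
            if k == 0:
                return
            idx = torch.argsort(norms)[:k]         # neurons with smallest norms
            last.weight[idx] = 0.0
            if last.bias is not None:
                last.bias[idx] = 0.0

    def fine_tune(model, clean_loader, epochs=2, lr=1e-3, device="cpu"):
        """Fine-tune for a few epochs on clean data after pruning."""
        model.to(device).train()
        opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
        for _ in range(epochs):
            for x, y in clean_loader:
                x, y = x.to(device), y.to(device)
                opt.zero_grad()
                F.cross_entropy(model(x), y).backward()
                opt.step()

    # Usage (hypothetical model and data loader):
    # prune_last_layer_l1(model, fraction=0.2)
    # fine_tune(model, clean_loader, epochs=2)

A global pruning variant would rank weights across all layers rather than within a single layer before zeroing them.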


    Published In

    IC3-2023: Proceedings of the 2023 Fifteenth International Conference on Contemporary Computing
    August 2023
    783 pages
    ISBN: 9798400700224
    DOI: 10.1145/3607947

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Adversarial attack
    2. network pruning
    3. neural networks

