DOI: 10.1145/3475724.3483608
Research article

Enhancing Adversarial Examples Transferability via Ensemble Feature Manifolds

Published: 22 October 2021

Abstract

Adversarial attacks cause intended misclassification by adding imperceptible perturbations to benign inputs, providing a way to evaluate the robustness of models. Many existing attacks achieve good performance in the white-box setting. However, the adversarial examples they generate typically overfit the particular architecture of the source model, resulting in low transferability in black-box scenarios. In this work, we propose a novel feature attack method called Features-Ensemble Generative Adversarial Network (FEGAN), which ensembles multiple feature manifolds to capture the intrinsic adversarial information most likely to cause misclassification across many models, thereby improving the transferability of adversarial examples. Accordingly, a generator trained on various latent feature vectors of benign inputs can produce adversarial examples containing this adversarial information. Extensive experiments on the MNIST and CIFAR10 datasets demonstrate that the proposed method improves the transferability of adversarial examples while maintaining the attack success rate in the white-box scenario. In addition, the generated adversarial examples are more realistic, with a distribution close to that of the real data.
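FEGAN itself trains a generator against feature manifolds drawn from several source models. As a loose illustration of the underlying intuition only (not the authors' method), the idea of perturbing an input so that its intermediate features move away from the benign features under an *ensemble* of extractors can be sketched with toy linear extractors. Everything here is hypothetical: the function name `ensemble_feature_attack`, the linear extractors `W`, and all parameters are illustrative stand-ins for real network feature maps.

```python
import numpy as np

def ensemble_feature_attack(x, extractors, eps=0.1, steps=10, alpha=0.02):
    """Toy sketch of an ensemble feature-space attack.

    Maximizes sum_k ||W_k (x + delta) - W_k x||^2 over an L_inf budget eps,
    using PGD-style sign-gradient steps. The W_k play the role of the
    intermediate feature maps of several source models.
    """
    delta = np.zeros_like(x)
    for _ in range(steps):
        # Gradient of sum_k ||W_k delta||^2 w.r.t. delta is 2 * sum_k W_k^T W_k delta.
        grad = sum(2.0 * W.T @ (W @ delta) for W in extractors)
        # The gradient vanishes at delta = 0, so seed with a small random direction.
        if not np.any(grad):
            grad = np.random.default_rng(0).standard_normal(x.shape)
        # Signed ascent step, projected back into the L_inf ball of radius eps.
        delta = np.clip(delta + alpha * np.sign(grad), -eps, eps)
    return x + delta
```

In this sketch the perturbation stays bounded by `eps` in each coordinate while the feature displacement, summed over all extractors, grows; ensembling the extractors is what keeps the perturbation from overfitting any single feature map, which mirrors the transferability argument in the abstract.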


Cited By

  • (2025) Common knowledge learning for generating transferable adversarial examples. Frontiers of Computer Science 19(10). DOI: 10.1007/s11704-024-40533-4. Online publication date: 28-Jan-2025.

Published In

ADVM '21: Proceedings of the 1st International Workshop on Adversarial Learning for Multimedia
October 2021, 73 pages
ISBN: 9781450386722
DOI: 10.1145/3475724

Publisher

Association for Computing Machinery, New York, NY, United States


      Author Tags

      1. adversarial attack
      2. ensemble features
      3. generative adversarial networks
      4. transferability


      Funding Sources

      • National NSF of China

Conference

MM '21: ACM Multimedia Conference
October 20, 2021
Virtual Event, China

