ABSTRACT
Deep neural networks excel at intuitive tasks that are hard to describe formally, such as classification, yet they are easily deceived into misclassification by maliciously crafted samples. It has recently been observed that the attack-specific robustness obtained through adversarial training generalizes poorly to novel or unseen attacks. While data augmentation through mixup in the input space has been shown to improve model generalization and robustness, research on mixup in the latent space remains limited. Moreover, almost no mixup research has considered model robustness against emerging on-manifold adversarial attacks. In this paper, we first design a latent-space data augmentation strategy called dual-mode manifold interpolation, which interpolates disentangled representations of source samples in two modes, convex mixing and binary mask mixing, to synthesize semantically meaningful samples. We then propose a resilient training framework, LatentRepresentationMixup (LarepMixup), which employs the mixed examples and a soft-label-based cross-entropy loss to refine the decision boundary. Experiments on diverse datasets (CIFAR-10, SVHN, ImageNet-Mixed10) demonstrate that our approach delivers competitive performance in training models robust to both off- and on-manifold adversarial example attacks, compared with leading mixup training techniques.
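The two interpolation modes named in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the helper names (`convex_mix`, `binary_mask_mix`, `soft_label`, `soft_label_cross_entropy`) are hypothetical, random vectors stand in for the disentangled representations an encoder would produce, and the generator that decodes mixed latents back to images is omitted.

```python
import numpy as np

def convex_mix(z1, z2, lam):
    # Convex interpolation in latent space: z = lam*z1 + (1-lam)*z2.
    return lam * z1 + (1.0 - lam) * z2

def binary_mask_mix(z1, z2, p, rng):
    # Binary-mask interpolation: each latent dimension is taken from z1
    # with probability p, otherwise from z2.
    mask = rng.random(z1.shape) < p
    return np.where(mask, z1, z2)

def soft_label(y1, y2, lam, num_classes):
    # Soft label for a mix: weighted sum of the two one-hot label vectors.
    return lam * np.eye(num_classes)[y1] + (1.0 - lam) * np.eye(num_classes)[y2]

def soft_label_cross_entropy(logits, soft_targets):
    # Cross-entropy against soft targets: -sum_c t_c * log softmax(logits)_c,
    # computed with the max-subtraction trick for numerical stability.
    m = logits.max()
    log_probs = logits - m - np.log(np.exp(logits - m).sum())
    return -(soft_targets * log_probs).sum()

rng = np.random.default_rng(0)
z1, z2 = rng.normal(size=512), rng.normal(size=512)  # stand-ins for encoder latents
lam = float(rng.beta(1.0, 1.0))                      # mixing coefficient, as in mixup
z_convex = convex_mix(z1, z2, lam)
z_masked = binary_mask_mix(z1, z2, p=0.5, rng=rng)
y_soft = soft_label(3, 7, lam, num_classes=10)
```

In training, a classifier's loss on the decoded mixed sample would be taken against `y_soft` rather than a hard label, which is what softens the decision boundary between the two source classes.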