Abstract
Recent improvements in deep learning models and their practical applications have raised concerns about the robustness of these models against adversarial examples. Adversarial training (AT) has been shown to be effective in making a model robust against the attack used during training. However, it usually fails against other attacks, i.e., the model overfits to the training attack scheme. In this paper, we propose a new method for generating adversarial perturbations during training that mitigates this issue. More specifically, we craft adversarial examples by minimizing the perturbation \(\ell _p\) norm while maximizing the classification loss, combined in a Lagrangian form. We argue that crafting adversarial examples based on this scheme improves the attack generalization of the learned model. We compare the robust accuracy of our final model with that of closely related state-of-the-art AT methods against attacks that were not used during training. This comparison demonstrates that our average robust accuracy against unseen attacks is 5.9% higher on the CIFAR-10 dataset and 3.2% higher on the ImageNet-100 dataset than that of the corresponding state-of-the-art methods. We also demonstrate that our attack is faster than other attack schemes designed for unseen attack generalization, and conclude that the proposed method is feasible for large datasets. Our code is available at https://github.com/rohban-lab/Lagrangian_Unseen.
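The core idea in the abstract, maximizing the classification loss while penalizing the perturbation \(\ell _p\) norm through a Lagrangian term, can be illustrated with a minimal sketch. The toy binary logistic model, step size, number of steps, and multiplier \(\lambda\) below are illustrative assumptions, not the paper's actual implementation (which trains deep networks; see the repository above):

```python
import numpy as np

def cross_entropy(w, b, x, y):
    """Binary logistic loss of a single example (x, y) under weights (w, b)."""
    z = w @ x + b
    p = 1.0 / (1.0 + np.exp(-z))
    return -(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))

def loss_grad_x(w, b, x, y):
    """Gradient of the logistic loss with respect to the input x."""
    z = w @ x + b
    p = 1.0 / (1.0 + np.exp(-z))
    return (p - y) * w

def lagrangian_attack(w, b, x, y, lam=0.1, p=2, steps=100, lr=0.1):
    """Gradient ascent on the Lagrangian objective
       loss(x + delta) - lam * ||delta||_p^p,
    so the perturbation norm is penalized rather than hard-constrained."""
    delta = np.zeros_like(x)
    for _ in range(steps):
        g_loss = loss_grad_x(w, b, x + delta, y)
        # gradient of ||delta||_p^p with respect to delta
        g_norm = p * np.sign(delta) * np.abs(delta) ** (p - 1)
        delta += lr * (g_loss - lam * g_norm)
    return delta
```

In contrast to projected-gradient attacks that clip the perturbation to a fixed \(\epsilon\)-ball, the norm here enters the objective itself, so the attack trades off loss increase against perturbation size, which is the property the abstract credits for better unseen-attack generalization.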
Data availability statement
All datasets used, including CIFAR-10, ImageNet-100, and Flowers-Recognition, are publicly available.
Code availability
The code is publicly available at https://github.com/rohban-lab/Lagrangian_Unseen, which contains the necessary instructions to reproduce the results.
Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Author information
Contributions
M.A. and M.H.R. contributed to the conception, design, analysis, interpretation of results, and drafting of the manuscript. M.A. also contributed to writing and running the code, and to designing and performing the ablation studies.
Ethics declarations
Conflict of interest
The authors have no competing interests to declare.
Ethical approval
Not applicable.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Additional information
Editor: Lijun Zhang.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Azizmalayeri, M., Rohban, M.H. Lagrangian objective function leads to improved unforeseen attack generalization. Mach Learn 112, 3003–3031 (2023). https://doi.org/10.1007/s10994-023-06348-3