Lagrangian objective function leads to improved unforeseen attack generalization


Abstract

Recent improvements in deep learning models and their practical applications have raised concerns about the robustness of these models against adversarial examples. Adversarial training (AT) has been shown to be effective in producing models that are robust to the attack used during training. However, such models usually fail against other attacks, i.e., the model overfits to the training attack scheme. In this paper, we propose a new method for generating adversarial perturbations during training that mitigates this issue. More specifically, we craft adversarial examples by minimizing the perturbation \(\ell_p\) norm while maximizing the classification loss, combined in Lagrangian form. We argue that crafting adversarial examples under this scheme leads to a learned model with improved attack generalization. We compare the robust accuracy of our final model with that of closely related state-of-the-art AT methods against attacks that were not used during training. This comparison shows that our average robust accuracy against unseen attacks is 5.9% higher on the CIFAR-10 dataset and 3.2% higher on the ImageNet-100 dataset than the corresponding state-of-the-art methods. We also show that our attack is faster than other attack schemes designed for unseen attack generalization, and conclude that the proposed method is feasible for large datasets. Our code is available at https://github.com/rohban-lab/Lagrangian_Unseen.
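
To make the objective concrete, below is a minimal PyTorch sketch of crafting adversarial examples by gradient ascent on a Lagrangian objective of the form \(\mathcal{L}(f_\theta(x+\delta), y) - \lambda \Vert \delta \Vert_p\), i.e., maximizing the classification loss while penalizing the perturbation norm. The helper name `lagrangian_attack`, the penalty weight `lam`, the signed-gradient update, and all hyper-parameter values are illustrative assumptions for exposition, not the exact algorithm or settings from the paper; the authors' implementation is available in the linked repository.

```python
# Minimal sketch (not the paper's exact algorithm): craft adversarial
# examples by gradient ascent on the Lagrangian objective
#   CE(f(x + delta), y) - lam * ||delta||_p
import torch
import torch.nn.functional as F


def lagrangian_attack(model, x, y, lam=1.0, p=2, steps=40, step_size=0.01):
    """Hypothetical helper: all hyper-parameters are illustrative defaults."""
    # Small random start keeps the norm (and its gradient) well defined at step 0.
    delta = torch.empty_like(x).uniform_(-1e-3, 1e-3).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)          # term to maximize
        penalty = delta.flatten(1).norm(p=p, dim=1).mean()   # term to minimize
        objective = loss - lam * penalty                     # Lagrangian form
        (grad,) = torch.autograd.grad(objective, delta)
        # Signed-gradient ascent step; a simplification of typical update rules.
        delta = (delta + step_size * grad.sign()).detach().requires_grad_(True)
    return (x + delta).clamp(0, 1).detach()


if __name__ == "__main__":
    # Toy usage with an untrained linear classifier on CIFAR-10-shaped inputs.
    model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
    x, y = torch.rand(4, 3, 32, 32), torch.randint(0, 10, (4,))
    x_adv = lagrangian_attack(model, x, y)
    print("mean l2 perturbation:", (x_adv - x).flatten(1).norm(dim=1).mean().item())
```

In this sketch, a larger `lam` favors smaller perturbations at the cost of weaker loss maximization; balancing these two terms is the trade-off the abstract credits with improved generalization to unseen attacks.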


Data availability statement

All datasets used in this work, namely CIFAR-10, ImageNet-100, and Flowers-Recognition, are publicly available.

Code availability

The code is publicly available at https://github.com/rohban-lab/Lagrangian_Unseen, which contains the necessary instructions to reproduce the results.

Notes

  1. https://www.kaggle.com/alxmamaev/flowers-recognition.


Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Author information


Contributions

M.A. and M.H.R. contributed to the conception, design, analysis, interpretation of results, and drafting of the manuscript. M.A. also contributed to writing and running the code, and to designing and performing the ablation studies.

Corresponding author

Correspondence to Mohammad Hossein Rohban.

Ethics declarations

Conflict of interest

The authors have no competing interests to declare.

Ethical approval

Not applicable.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Additional information

Editor: Lijun Zhang.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Azizmalayeri, M., Rohban, M.H. Lagrangian objective function leads to improved unforeseen attack generalization. Mach Learn 112, 3003–3031 (2023). https://doi.org/10.1007/s10994-023-06348-3

