
Attribution of Adversarial Attacks via Multi-task Learning

  • Conference paper
Neural Information Processing (ICONIP 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14448)


Abstract

Deep neural networks (DNNs) can be easily fooled at inference time by adversarial examples, in which attackers add imperceptible perturbations to original examples. Many works focus on adversarial detection and adversarial training to defend against such attacks. However, few works explore the tool-chains behind adversarial examples, a task known as the Adversarial Attribution Problem (AAP). In this paper, AAP is defined as the recognition of three signatures: the attack algorithm, the victim model, and the hyperparameter. Existing works cast AAP as a single-label classification task and ignore the relationships among these three signatures. In fact, an owner-member relationship exists between the attack algorithm and its hyperparameter, meaning that hyperparameter recognition relies on the result of attack-algorithm classification. Moreover, hyperparameter values are continuous, so hyperparameter recognition should be treated as a regression task. Consequently, AAP should be considered a multi-task learning problem rather than a single-label classification or single-task learning problem. To address these issues, we propose a multi-task learning framework named Multi-Task Adversarial Attribution (MTAA) that recognizes all three signatures simultaneously. It takes into account the relationship between the attack algorithm and its corresponding hyperparameter and uses an uncertainty-weighted loss to balance the three recognition tasks. Experimental results on MNIST and ImageNet demonstrate the feasibility and scalability of the proposed framework.
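A minimal PyTorch sketch (not the authors' released code) of the multi-task setup the abstract describes: a shared feature extractor with classification heads for the attack algorithm and the victim model, a regression head for the continuous hyperparameter that is conditioned on the attack-algorithm prediction to reflect the owner-member relationship, and a common simplified form of the uncertainty-weighted loss of Kendall et al. All layer sizes, class counts, and names (e.g. MTAASketch) are illustrative assumptions, not the paper's actual architecture.

    # Sketch of a multi-task adversarial-attribution model; sizes and names are assumptions.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MTAASketch(nn.Module):
        def __init__(self, num_attacks=4, num_victims=3):
            super().__init__()
            # Shared backbone over adversarial-example inputs (e.g. 28x28 MNIST images).
            self.backbone = nn.Sequential(
                nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(4), nn.Flatten(),
                nn.Linear(32 * 16, 128), nn.ReLU(),
            )
            self.attack_head = nn.Linear(128, num_attacks)   # which attack algorithm
            self.victim_head = nn.Linear(128, num_victims)   # which victim model
            # Owner-member relationship: the hyperparameter regressor also sees the
            # attack-algorithm logits, since the hyperparameter's meaning depends on
            # which attack produced the example.
            self.hyper_head = nn.Linear(128 + num_attacks, 1)
            # Learnable log-variances for the uncertainty-weighted loss (one per task).
            self.log_vars = nn.Parameter(torch.zeros(3))

        def forward(self, x):
            feat = self.backbone(x)
            attack_logits = self.attack_head(feat)
            victim_logits = self.victim_head(feat)
            hyper_pred = self.hyper_head(torch.cat([feat, attack_logits], dim=1))
            return attack_logits, victim_logits, hyper_pred.squeeze(1)

        def loss(self, outputs, attack_y, victim_y, hyper_y):
            attack_logits, victim_logits, hyper_pred = outputs
            l_attack = F.cross_entropy(attack_logits, attack_y)
            l_victim = F.cross_entropy(victim_logits, victim_y)
            l_hyper = F.mse_loss(hyper_pred, hyper_y)
            # Simplified uncertainty weighting: L = sum_i exp(-s_i) * L_i + s_i,
            # with s_i = log(sigma_i^2) learned per task.
            losses = torch.stack([l_attack, l_victim, l_hyper])
            return (torch.exp(-self.log_vars) * losses + self.log_vars).sum()

    if __name__ == "__main__":
        model = MTAASketch()
        x = torch.randn(8, 1, 28, 28)          # a batch of adversarial examples
        attack_y = torch.randint(0, 4, (8,))   # attack-algorithm labels
        victim_y = torch.randint(0, 3, (8,))   # victim-model labels
        hyper_y = torch.rand(8)                # continuous hyperparameter (e.g. epsilon)
        loss = model.loss(model(x), attack_y, victim_y, hyper_y)
        loss.backward()
        print(float(loss))

The learnable log-variance per task lets the optimizer down-weight noisier tasks automatically instead of hand-tuning fixed loss weights, which is the role the abstract assigns to the uncertainty-weighted loss.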



Acknowledgements

This work was partially supported by the National Natural Science Foundation of China (No. 61772284).

Author information


Corresponding author

Correspondence to Yun Li.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Guo, Z., Han, K., Ge, Y., Li, Y., Ji, W. (2024). Attribution of Adversarial Attacks via Multi-task Learning. In: Luo, B., Cheng, L., Wu, ZG., Li, H., Li, C. (eds) Neural Information Processing. ICONIP 2023. Lecture Notes in Computer Science, vol 14448. Springer, Singapore. https://doi.org/10.1007/978-981-99-8082-6_7


  • DOI: https://doi.org/10.1007/978-981-99-8082-6_7

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8081-9

  • Online ISBN: 978-981-99-8082-6

  • eBook Packages: Computer Science (R0)
