Abstract
Deep neural networks (DNNs) are easily fooled at inference time by adversarial examples, in which attackers add imperceptible perturbations to benign inputs. Many works focus on adversarial detection and adversarial training to defend against such attacks, but few explore the tool-chains behind adversarial examples, a task known as the Adversarial Attribution Problem (AAP). In this paper, AAP is defined as the recognition of three signatures: the attack algorithm, the victim model, and the hyperparameter. Existing works cast AAP as a single-label classification task and ignore the relationships among these three signatures. In fact, an owner-member relationship exists between the attack algorithm and its hyperparameter, meaning that hyperparameter recognition depends on the result of attack-algorithm classification. Moreover, because hyperparameter values are continuous, hyperparameter recognition should be treated as a regression task. AAP should therefore be considered a multi-task learning problem rather than a single-label classification or single-task learning problem. To address these issues, we propose a multi-task learning framework named Multi-Task Adversarial Attribution (MTAA) that recognizes the three signatures simultaneously. It accounts for the relationship between the attack algorithm and its corresponding hyperparameter, and it uses an uncertainty-weighted loss to adjust the weights of the three recognition tasks. Experimental results on MNIST and ImageNet show the feasibility and scalability of the proposed framework.
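The uncertainty-weighted loss the abstract refers to follows the homoscedastic-uncertainty formulation of Kendall et al. (CVPR 2018): each task loss is scaled by a learnable precision and regularized by a log-variance term, so the optimizer balances the two classification tasks (attack algorithm, victim model) against the hyperparameter regression task. The sketch below is an illustrative, framework-free restatement of that weighting scheme, not the paper's actual implementation; the function name and plain-Python form are assumptions.

```python
import math

def uncertainty_weighted_loss(task_losses, log_vars, is_regression):
    """Combine per-task losses with homoscedastic uncertainty weighting
    (Kendall et al., CVPR 2018).

    task_losses   -- raw loss value of each task
    log_vars      -- learnable log(sigma^2) per task (trained jointly)
    is_regression -- True for regression tasks, False for classification
    """
    total = 0.0
    for loss, log_var, reg in zip(task_losses, log_vars, is_regression):
        precision = math.exp(-log_var)      # 1 / sigma^2
        scale = 0.5 if reg else 1.0         # regression uses 1/(2*sigma^2)
        # Weighted task loss plus log(sigma) regularizer, which keeps the
        # optimizer from driving all variances to infinity.
        total += scale * precision * loss + 0.5 * log_var
    return total
```

In a full framework, `log_vars` would be trainable parameters updated by backpropagation alongside the network weights; a task whose loss is noisy learns a larger variance and is automatically down-weighted.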
Acknowledgements
This work was partially supported by National Natural Science Foundation of China (No. 61772284).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Guo, Z., Han, K., Ge, Y., Li, Y., Ji, W. (2024). Attribution of Adversarial Attacks via Multi-task Learning. In: Luo, B., Cheng, L., Wu, ZG., Li, H., Li, C. (eds) Neural Information Processing. ICONIP 2023. Lecture Notes in Computer Science, vol 14448. Springer, Singapore. https://doi.org/10.1007/978-981-99-8082-6_7
DOI: https://doi.org/10.1007/978-981-99-8082-6_7
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8081-9
Online ISBN: 978-981-99-8082-6