Abstract
Adversarial attacks have emerged in visual object tracking as a means of misleading a tracker and causing it to fail. Black-box attacks in particular have attracted increasing attention owing to their relevance to real-world applications. In the paradigm of decision-based black-box attacks, the perturbation magnitude is gradually amplified, while the optimisation direction is defined by an initial adversarial sample. Given the pivotal role the initial adversarial sample plays in determining the success of an attack, we utilise the noise generated by the reverse process of a diffusion model to provide a stronger attack direction. On the one hand, the diffusion model produces Gaussian noise that induces global information interaction, exerting a comprehensive impact on Transformer-based trackers. On the other hand, the diffusion model attends more closely to the target region during the reverse process, yielding a more powerful perturbation of the target object. Our method is widely applicable and has been validated on a range of trackers using several benchmark datasets, where it delivers greater tracking performance degradation than other state-of-the-art methods. We also investigate alternative ways of generating the initial adversarial sample, confirming the effectiveness and rationale of the proposed diffusion initialisation.
This work is supported in part by the National Natural Science Foundation of China (Grant Nos. 62106089 and 62020106012).
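The following is a minimal, illustrative sketch of the general idea described in the abstract, not the authors' implementation: a clean search-region image is forward-diffused with Gaussian noise and then denoised for one reverse step by a pretrained noise-prediction network, and the resulting estimate is used as the initial adversarial sample for a decision-based attack. The function and parameter names (diffusion_init, denoiser, alpha_bar) are hypothetical and assume standard DDPM/DDIM notation.

```python
import torch

def diffusion_init(x, denoiser, t, alpha_bar):
    """Sketch of a diffusion-based initialisation for a decision-based attack.

    x         : clean search-region image in [0, 1], shape (1, 3, H, W)
    denoiser  : assumed pretrained noise-prediction network eps_theta(x_t, t)
    t         : diffusion timestep (int)
    alpha_bar : cumulative noise schedule, tensor of shape (T,)
    """
    a_bar = alpha_bar[t]
    # Gaussian noise provides global information interaction across the image.
    noise = torch.randn_like(x)
    # Forward-diffuse the clean image to timestep t (standard DDPM forward process).
    x_t = a_bar.sqrt() * x + (1 - a_bar).sqrt() * noise
    # One reverse step: the predicted noise tends to concentrate on the target region.
    eps_hat = denoiser(x_t, torch.tensor([t]))
    # DDIM-style estimate of x0 from x_t and the predicted noise.
    x0_hat = (x_t - (1 - a_bar).sqrt() * eps_hat) / a_bar.sqrt()
    return x0_hat.clamp(0, 1)

# A decision-based attack (e.g. a HopSkipJump- or IoU-style search) would then start
# from diffusion_init(x, ...) rather than from a uniformly random adversarial example.
```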
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Wang, R., Xu, T., Zhao, S., Wu, XJ., Kittler, J. (2023). Diffusion Init: Stronger Initialisation of Decision-Based Black-Box Attacks for Visual Object Tracking. In: Lu, H., Blumenstein, M., Cho, SB., Liu, CL., Yagi, Y., Kamiya, T. (eds) Pattern Recognition. ACPR 2023. Lecture Notes in Computer Science, vol 14407. Springer, Cham. https://doi.org/10.1007/978-3-031-47637-2_28