U-Turn: Crafting Adversarial Queries with Opposite-Direction Features

International Journal of Computer Vision

Abstract

This paper aims to craft adversarial queries for image retrieval, which uses image features for similarity measurement. Many commonly used attack methods were developed in the context of image classification. However, these methods, which attack prediction probabilities, exert only an indirect influence on the image features and are thus less effective when applied to the retrieval problem. To design an attack method specifically for image retrieval, we introduce the opposite-direction feature attack (ODFA), a white-box approach that directly attacks query image features to generate adversarial queries. As the name implies, the main idea underpinning ODFA is to push the original image feature in the opposite direction, similar to a U-turn. This simple idea is experimentally evaluated on five retrieval datasets. We show that the adversarial queries generated by ODFA push true matches out of the top ranks, and that the attack success rate is consistently higher than that of classifier attack methods. In addition, our method of creating adversarial queries can be extended to multi-scale query inputs and generalizes to other retrieval models without prior knowledge of their weights, i.e., the black-box setting.
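
To make the opposite-direction idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy linear feature extractor `W` in place of a CNN, a squared-distance loss to the negated feature, and a signed-gradient update with a per-coordinate budget `eps`; all names and values (`features`, `opposite_direction_attack`, `W`, `x`, `eps`, `step`, `iters`) are hypothetical and chosen only for illustration.

```python
import math

def features(W, x):
    """Toy linear 'feature extractor' f(x) = W x, standing in for a CNN embedding."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def cosine(a, b):
    """Cosine similarity, the kind of score a retrieval system might rank by."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return dot / (norm_a * norm_b)

def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def opposite_direction_attack(W, x, eps, step, iters):
    """Push f(x_adv) toward -f(x) while keeping |x_adv - x| <= eps per coordinate."""
    target = [-v for v in features(W, x)]  # the assumed "U-turn" target feature
    x_adv = list(x)
    for _ in range(iters):
        diff = [a - t for a, t in zip(features(W, x_adv), target)]
        # Analytic gradient of ||W x_adv - target||^2 w.r.t. x_adv: 2 * W^T diff.
        grad = [2.0 * sum(W[i][j] * diff[i] for i in range(len(W)))
                for j in range(len(x))]
        # Signed descent step, then projection back into the eps-ball around x.
        x_adv = [xa - step * sign(g) for xa, g in zip(x_adv, grad)]
        x_adv = [min(max(xa, xo - eps), xo + eps) for xa, xo in zip(x_adv, x)]
    return x_adv

# Hypothetical toy values, chosen only for illustration.
W = [[1.0, -1.0, 0.5], [0.0, 2.0, 1.0]]
x = [0.5, 0.2, 0.8]
eps = 0.2
x_adv = opposite_direction_attack(W, x, eps=eps, step=0.05, iters=10)

print("cosine(f(x), f(x))     =", cosine(features(W, x), features(W, x)))
print("cosine(f(x), f(x_adv)) =", cosine(features(W, x), features(W, x_adv)))
```

In this toy run the similarity between the original and adversarial features falls below 1 but stays positive, because the budget is small relative to a three-dimensional input. The sketch only illustrates the direction of the update: the loss acts on the feature itself rather than on class probabilities.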

Notes

  1. There exist multi-query retrieval systems (Zheng et al., 2015; Wang et al., 2017) but for simplicity, we only consider single-query systems.

  2. https://github.com/layumi/U_turn

  3. https://github.com/cleverhans-lab/cleverhans

References

  • Athalye, A., Engstrom, L., Ilyas, A., and Kwok, K. (2018). Synthesizing robust adversarial examples. In ICML.

  • Babenko, A., Slesarev, A., Chigorin, A., and Lempitsky, V. (2014). Neural codes for image retrieval. In ECCV.

  • Bai, S., Bai, X., and Tian, Q. (2017). Scalable person re-identification on supervised smoothed manifold. In CVPR.

  • Bai, S., Li, Y., Zhou, Y., Li, Q., & Torr, P. H. (2020). Adversarial metric attack and defense for person re-identification. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(6), 2119–2126.

  • Bai, X., Yang, M., Huang, T., Dou, Z., Yu, R., & Xu, Y. (2020). Deep-person: Learning discriminative deep features for person re-identification. Pattern Recognition, 98, 107036.

  • Bouniot, Q., Audigier, R., and Loesch, A. (2020). Vulnerability of person re-identification models to metric adversarial attacks. In CVPR Workshop.

  • Chakraborty, A., Alam, M., Dey, V., Chattopadhyay, A., and Mukhopadhyay, D. (2018). Adversarial attacks and defences: A survey. arXiv:1810.00069.

  • Chen, J. and Ngo, C.-W. (2016). Deep-based ingredient recognition for cooking recipe retrieval. In ACM Multimedia.

  • Chen, Y., Zhu, X., and Gong, S. (2017). Person re-identification by deep learning multi-scale representations. In ICCV.

  • Cherepanova, V., Goldblum, M., Foley, H., Duan, S., Dickerson, J., Taylor, G., and Goldstein, T. (2021). Lowkey: leveraging adversarial attacks to protect social media users from facial recognition. In ICLR.

  • Deng, C., Yang, X., Nie, F., & Tao, D. (2019). Saliency detection via a multiple self-weighted graph-based manifold ranking. IEEE Transactions on Multimedia, 22(4), 885–896.

  • Dong, Y., Liao, F., Pang, T., Su, H., Zhu, J., Hu, X., and Li, J. (2018). Boosting adversarial attacks with momentum. In CVPR.

  • Eykholt, K., Evtimov, I., Fernandes, E., Li, B., Rahmati, A., Xiao, C., Prakash, A., Kohno, T., and Song, D. (2018). Robust physical-world attacks on deep learning visual classification. In CVPR.

  • Gong, Y., Huang, L., and Chen, L. (2022). Person re-identification method based on color attack and joint defence. In CVPR.

  • Goodfellow, I. J., Shlens, J., and Szegedy, C. (2015). Explaining and harnessing adversarial examples. In ICLR.

  • Guo, H., Zhao, C., Liu, Z., Wang, J., and Lu, H. (2018). Learning coarse-to-fine structured feature embedding for vehicle re-identification. In AAAI.

  • He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep residual learning for image recognition. In CVPR.

  • Hermans, A., Beyer, L., and Leibe, B. (2017). In defense of the triplet loss for person re-identification. arXiv:1703.07737.

  • Huang, G., Liu, Z., Maaten, L. V. D., and Weinberger, K. Q. (2017). Densely connected convolutional networks. In CVPR.

  • Jin, L., Li, K., Li, Z., Xiao, F., Qi, G.-J., & Tang, J. (2018). Deep semantic-preserving ordinal hashing for cross-modal similarity search. IEEE Transactions on Neural Networks and Learning Systems, 30(5), 1429–1440.

  • Kawano, Y. and Yanai, K. (2014). Automatic expansion of a food image dataset leveraging existing categories with domain adaptation. In ECCV workshop on transferring and adapting source knowledge in computer vision (TASK-CV).

  • Krizhevsky, A. and Hinton, G. (2009). Learning multiple layers of features from tiny images.

  • Kurakin, A., Goodfellow, I., and Bengio, S. (2017). Adversarial examples in the physical world. In ICLR Workshop.

  • LeCun, Y., Bengio, Y., et al. (1995). Convolutional networks for images, speech, and time series. The Handbook of Brain Theory and Neural Networks, 3361(10), 1995.

  • Li, J., Ji, R., Liu, H., Hong, X., Gao, Y., and Tian, Q. (2019a). Universal perturbation attack against image retrieval. In ICCV.

  • Li, K., Qi, G.-J., & Hua, K. A. (2018). Learning label preserving binary codes for multimedia retrieval: A general approach. ACM Transactions on Multimedia, Computing, Communications and Applications (TOMM), 14(1), 2.

  • Li, X., Li, J., Chen, Y., Ye, S., He, Y., Wang, S., Su, H., and Xue, H. (2021). Qair: Practical query-efficient black-box attacks for image retrieval. In CVPR.

  • Li, Y., Yao, T., Pan, Y., Chao, H., & Mei, T. (2019). Deep metric learning with density adaptivity. IEEE Transactions on Multimedia, 22(5), 1285–1297.

  • Lin, K., Lu, J., Chen, C.-S., Zhou, J., & Sun, M.-T. (2018). Unsupervised deep learning of compact binary descriptors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(6), 1501–1514.

  • Lin, K., Yang, H.-F., Hsiao, J.-H., and Chen, C.-S. (2015). Deep learning of binary hash codes for fast image retrieval. In CVPR Workshop.

  • Liu, M.-Y., Huang, X., Mallya, A., Karras, T., Aila, T., Lehtinen, J., and Kautz, J. (2019a). Few-shot unsupervised image-to-image translation. In CVPR.

  • Liu, S., Song, Z., Liu, G., Xu, C., Lu, H., and Yan, S. (2012). Street-to-shop: Cross-scenario clothing retrieval via parts alignment and auxiliary set. In CVPR.

  • Liu, W., Wen, Y., Yu, Z., and Yang, M. (2016). Large-margin softmax loss for convolutional neural networks. In ICML.

  • Liu, X., Liu, W., Mei, T., & Ma, H. (2017). Provid: Progressive and multimodal vehicle reidentification for large-scale urban surveillance. IEEE Transactions on Multimedia, 20(3), 645–658.

  • Liu, Y., Chen, X., Liu, C., and Song, D. (2017b). Delving into transferable adversarial examples and black-box attacks. In ICLR.

  • Liu, Z., Zhao, Z., and Larson, M. (2019b). Who’s afraid of adversarial queries? the impact of image modifications on content-based image retrieval. In ICMR.

  • Madry, A., Makelov, A., Schmidt, L., Tsipras, D., and Vladu, A. (2018). Towards deep learning models resistant to adversarial attacks. In ICLR.

  • Moosavi-Dezfooli, S.-M., Fawzi, A., and Frossard, P. (2016). Deepfool: a simple and accurate method to fool deep neural networks. In CVPR.

  • Moosavi-Dezfooli, S.-M., Fawzi, A., Fawzi, O., and Frossard, P. (2017). Universal adversarial perturbations. In CVPR.

  • Narodytska, N. and Kasiviswanathan, S. P. (2017). Simple black-box adversarial perturbations for deep networks. In CVPR Workshop.

  • Papernot, N., McDaniel, P., and Goodfellow, I. (2016a). Transferability in machine learning: from phenomena to black-box attacks using adversarial samples. arXiv:1605.07277.

  • Papernot, N., McDaniel, P., Goodfellow, I., Jha, S., Celik, Z. B., and Swami, A. (2017). Practical black-box attacks against machine learning. In Proceedings of the 2017 ACM on asia conference on computer and communications security, pp. 506–519.

  • Papernot, N., McDaniel, P., Jha, S., Fredrikson, M., Celik, Z. B., and Swami, A. (2016b). The limitations of deep learning in adversarial settings. In European Symposium on Security & Privacy.

  • Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., and Lerer, A. (2017). Automatic differentiation in PyTorch. In NeurIPS Workshop.

  • Philbin, J., Chum, O., Isard, M., Sivic, J., and Zisserman, A. (2007). Object retrieval with large vocabularies and fast spatial matching. In CVPR.

  • Philbin, J., Chum, O., Isard, M., Sivic, J., and Zisserman, A. (2008). Lost in quantization: Improving particular object retrieval in large scale image databases. In CVPR.

  • Qian, X., Fu, Y., Jiang, Y.-G., Xiang, T., and Xue, X. (2017). Multi-scale deep learning architectures for person re-identification. In ICCV.

  • Radenović, F., Tolias, G., and Chum, O. (2016). Cnn image retrieval learns from bow: Unsupervised fine-tuning with hard examples. In ECCV.

  • Radenović, F., Tolias, G., & Chum, O. (2018). Fine-tuning CNN image retrieval with no human annotation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(7), 1655–1668.

  • Ristani, E. and Tomasi, C. (2018). Features for multi-target multi-camera tracking and re-identification. In CVPR.

  • Schonberger, J. L., Radenovic, F., Chum, O., and Frahm, J.-M. (2015). From single image query to detailed 3d reconstruction. In CVPR.

  • Sharif, M., Bhagavatula, S., Bauer, L., and Reiter, M. K. (2016). Accessorize to a crime: Real and stealthy attacks on state-of-the-art face recognition. In ACM SIGSAC conference on computer and communications security.

  • Shen, C., Jin, Z., Chu, W., Jiang, R., Chen, Y., Qi, G.-J., & Hua, X.-S. (2019). Multi-level similarity perception network for person re-identification. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 15(2), 32.

  • Shi, C., Xu, X., Ji, S., Bu, K., Chen, J., Beyah, R., and Wang, T. (2021). Adversarial captchas. IEEE Transactions on Cybernetics.

  • Sigurbjörnsson, B. and Van Zwol, R. (2008). Flickr tag recommendation based on collective knowledge. In WWW.

  • Simonyan, K. and Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. In ICLR.

  • Song, H. O., Xiang, Y., Jegelka, S., and Savarese, S. (2016). Deep metric learning via lifted structured feature embedding. In CVPR.

  • Suh, Y., Wang, J., Tang, S., Mei, T., and Mu Lee, K. (2018). Part-aligned bilinear representations for person re-identification. In ECCV.

  • Sun, Y., Zheng, L., Li, Y., Yang, Y., Tian, Q., & Wang, S. (2019). Learning part-based convolutional features for person re-identification. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(3), 902–917.

  • Sun, Y., Zheng, L., Yang, Y., Tian, Q., and Wang, S. (2018). Beyond part models: Person retrieval with refined part pooling. In ECCV.

  • Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., and Fergus, R. (2014). Intriguing properties of neural networks. In ICLR.

  • Tolias, G., Radenovic, F., and Chum, O. (2019). Targeted mismatch adversarial attack: Query with a flower to retrieve the tower. In ICCV.

  • Tolias, G., Sicre, R., and Jégou, H. (2015). Particular object retrieval with integral max-pooling of cnn activations. In ICLR.

  • Tramèr, F., Kurakin, A., Papernot, N., Goodfellow, I., Boneh, D., and McDaniel, P. (2018). Ensemble adversarial training: Attacks and defenses. In ICLR.

  • Van der Maaten, L. and Hinton, G. (2008). Visualizing data using t-SNE. Journal of machine learning research, 9(11).

  • Wah, C., Branson, S., Welinder, P., Perona, P., and Belongie, S. (2011). The Caltech-UCSD Birds-200-2011 Dataset. Technical Report CNS-TR-2011-001, California Institute of Technology.

  • Wang, J., Sun, K., Cheng, T., Jiang, B., Deng, C., Zhao, Y., Liu, D., Mu, Y., Tan, M., Wang, X., et al. (2020). Deep high-resolution representation learning for visual recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(10), 3349–3364.

  • Wang, W., Qian, X., Fu, Y., and Xue, X. (2022). Dst: Dynamic substitute training for data-free black-box attack. In CVPR.

  • Wang, X., Li, S., Liu, M., Wang, Y., and Roy-Chowdhury, A. K. (2021). Multi-expert adversarial attack detection in person re-identification using context inconsistency. In ICCV.

  • Wang, Y., Lin, X., Wu, L., & Zhang, W. (2017). Effective multi-query expansions: Collaborative deep networks for robust landmark retrieval. IEEE Transactions on Image Processing, 26(3), 1393–1404.

  • Wei, Y., Zhao, Y., Lu, C., Wei, S., Liu, L., Zhu, Z., & Yan, S. (2016). Cross-modal retrieval with CNN visual features: A new baseline. IEEE Transactions on Cybernetics, 47(2), 449–460.

  • Xiao, C., Zhu, J.-Y., Li, B., He, W., Liu, M., and Song, D. (2018). Spatially transformed adversarial examples. In ICLR.

  • Yan, C., Gong, B., Wei, Y., & Gao, Y. (2020). Deep multi-view enhancement hashing for image retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(4), 1445–1451.

  • Yan, C., Li, Z., Zhang, Y., Liu, Y., Ji, X., & Zhang, Y. (2020). Depth image denoising using nuclear norm and learning graph model. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 16(4), 1–17.

  • Yan, C., Teng, T., Liu, Y., Zhang, Y., Wang, H., & Ji, X. (2021). Precise no-reference image quality evaluation based on distortion identification. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 17, 1–21.

  • Yang, E., Deng, C., Li, C., Liu, W., Li, J., & Tao, D. (2018). Shared predictive cross-modal deep quantization. IEEE Transactions on Neural Networks and Learning Systems, 29(11), 5292–5303.

  • Yang, E., Liu, T., Deng, C., & Tao, D. (2018). Adversarial examples for hamming space search. IEEE Transactions on Cybernetics, 50(4), 1473–1484.

  • Yang, H.-F., Lin, K., & Chen, C.-S. (2017). Supervised learning of semantics-preserving hash via deep convolutional neural networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(2), 437–451.

  • Yang, X., Zhou, P., & Wang, M. (2018). Person reidentification via structural deep metric learning. IEEE Transactions on Neural Networks and Learning Systems, 30(10), 2987–2998.

  • Yang, Y., Zhuang, Y., & Pan, Y. (2021). Multiple knowledge representation for big data artificial intelligence: Framework, applications, and case studies. Frontiers of Information Technology and Electronic Engineering, 22(12), 1551–1558. https://doi.org/10.1631/FITEE.2100463

  • Yu, H., Dong, F., Li, J., Xie, W., Qiu, J., and Gu, Z. (2021). Adversarial attacks on vehicle re-identification. In DSC.

  • Yu, Q., Chang, X., Song, Y.-Z., Xiang, T., and Hospedales, T. M. (2017). The devil is in the middle: Exploiting mid-level representations for cross-domain instance matching. arXiv:1711.08106.

  • Yu, R., Dou, Z., Bai, S., Zhang, Z., Xu, Y., and Bai, X. (2018). Hard-aware point-to-set deep metric for person re-identification. In ECCV.

  • Yue-Hei Ng, J., Yang, F., and Davis, L. S. (2015). Exploiting local features from deep networks for image retrieval. In CVPR Workshop.

  • Zagoruyko, S. and Komodakis, N. (2016). Wide residual networks. In BMVC.

  • Zhang, S., Ji, R., Hu, J., Lu, X., & Li, X. (2018). Face sketch synthesis by multidomain adversarial learning. IEEE Transactions on Neural Networks and Learning Systems, 30(5), 1419–1428.

  • Zhang, X., Luo, H., Fan, X., Xiang, W., Sun, Y., Xiao, Q., Jiang, W., Zhang, C., and Sun, J. (2017). Alignedreid: Surpassing human-level performance in person re-identification. arXiv:1711.08184.

  • Zhang, X., Zhang, R., Cao, J., Gong, D., You, M., and Shen, C. (2020). Part-guided attention learning for vehicle instance retrieval. IEEE Transactions on Intelligent Transportation Systems.

  • Zhao, G., Zhang, M., Liu, J., Li, Y., and Wen, J.-R. (2020). Ap-gan: Adversarial patch attack on content-based image retrieval systems. GeoInformatica, pp. 1–31.

  • Zheng, L., Shen, L., Tian, L., Wang, S., Wang, J., and Tian, Q. (2015). Scalable person re-identification: A benchmark. In ICCV.

  • Zheng, L., Yang, Y., and Hauptmann, A. G. (2016). Person re-identification: Past, present and future. arXiv:1610.02984.

  • Zheng, Z., Ruan, T., Wei, Y., Yang, Y., & Mei, T. (2020). Vehiclenet: Learning robust visual representation for vehicle re-identification. IEEE Transactions on Multimedia. https://doi.org/10.1109/TMM.2020.3014488

  • Zheng, Z., Zheng, L., Hu, Z., and Yang, Y. (2018a). Open set adversarial examples. arXiv:1809.02681.

  • Zheng, Z., Zheng, L., & Yang, Y. (2018). A discriminatively learned CNN embedding for person reidentification. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 14(1), 13. https://doi.org/10.1145/3159171

  • Zhong, Z., Zheng, L., Luo, Z., Li, S., & Yang, Y. (2020). Learning to adapt invariance in memory for person re-identification. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(8), 2723–2738.

  • Zhou, M., Wang, L., Niu, Z., Zhang, Q., Xu, Y., Zheng, N., and Hua, G. (2021). Practical relative order attack in deep ranking. In ICCV.

Author information

Corresponding author

Correspondence to Liang Zheng.

Additional information

Communicated by V. Lepetit.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was supported by the Fundamental Research Funds for the Central Universities (No. 226-2022-00051).

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zheng, Z., Zheng, L., Yang, Y. et al. U-Turn: Crafting Adversarial Queries with Opposite-Direction Features. Int J Comput Vis 131, 835–854 (2023). https://doi.org/10.1007/s11263-022-01737-y

  • DOI: https://doi.org/10.1007/s11263-022-01737-y
