
Modified dual attention triplet-supervised hashing network for image retrieval

  • Original Paper
  • Signal, Image and Video Processing

Abstract

Existing image retrieval methods extract features insufficiently and fail to capture the correlation between deep features. To address these problems, a modified dual attention triplet-supervised hashing network (MDATSH) is proposed. A modified dual attention module is added after the deep neural network; it extends dual attention by equipping each of the two attention modules with a local branch. Specifically, a local spatial attention branch is added to the position attention module and a local channel attention branch is added to the channel attention module, so that the network captures global dependencies without losing local information and effectively captures the correlation between deep features of the image. By combining these two attention mechanisms, the network extracts crucial information from input images and thereby makes the image feature representation more robust. In addition, a dynamic cross-entropy loss function that adjusts the loss weights during model training is combined with the triplet loss function to enhance the class separability of image hash codes while maintaining semantic similarity. Experimental results on three public datasets show that MDATSH improves image retrieval performance.
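
As a rough illustration of the triplet-supervised hashing idea described in the abstract, the following minimal Python sketch shows a hinge-style triplet loss on real-valued (relaxed) codes, sign binarization, and ranking by Hamming distance. The squared Euclidean distance, the margin value, the sign threshold, and all function names are assumptions made for illustration; the abstract does not specify the exact loss formulation, and the attention modules and the dynamic cross-entropy term are not modeled here.

```python
from typing import List, Sequence


def binarize(features: Sequence[float]) -> List[int]:
    """Map real-valued outputs to a {-1, +1} hash code via the sign (assumed rule)."""
    return [1 if x >= 0 else -1 for x in features]


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions where two equal-length codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have equal length")
    return sum(1 for x, y in zip(a, b) if x != y)


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two relaxed codes."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor: Sequence[float],
                 positive: Sequence[float],
                 negative: Sequence[float],
                 margin: float = 1.0) -> float:
    """Hinge triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative is farther from the anchor
    than the positive by at least `margin` (an assumed hyperparameter).
    """
    return max(0.0, sq_dist(anchor, positive) - sq_dist(anchor, negative) + margin)


def rank_by_hamming(query: Sequence[int],
                    database: Sequence[Sequence[int]]) -> List[int]:
    """Return database indices ordered by Hamming distance to the query."""
    return sorted(range(len(database)), key=lambda i: hamming(query, database[i]))


# Hypothetical 4-bit relaxed codes for one triplet.
anchor = [0.9, -0.8, 0.7, -0.6]
positive = [0.8, -0.9, 0.6, -0.5]   # same class as the anchor
negative = [-0.7, 0.9, -0.8, 0.4]   # different class

loss = triplet_loss(anchor, positive, negative)
ranking = rank_by_hamming(binarize(anchor), [binarize(negative), binarize(positive)])
```

Ranking by Hamming distance over short binary codes is what makes hashing-based retrieval cheap compared with comparing full real-valued features.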

Data availability

All the datasets explored in this paper are publicly available.

Funding

This work is supported by the National Natural Science Foundation of China (No. 62277016).

Author information

Contributions

All authors contributed equally to the article. All authors reviewed the manuscript.

Corresponding author

Correspondence to Jingwen Chen.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Cheng, X., Chen, J. & Wang, R. Modified dual attention triplet-supervised hashing network for image retrieval. SIViP 18, 1939–1948 (2024). https://doi.org/10.1007/s11760-023-02908-1

  • DOI: https://doi.org/10.1007/s11760-023-02908-1
