Abstract
Existing image retrieval methods extract features insufficiently and fail to capture the correlation between deep features. To address these problems, a modified dual attention triplet-supervised hashing network (MDATSH) is proposed. A modified dual attention module is added after the deep neural network: it builds on dual attention and equips each of the two attention modules with a local branch. Specifically, a local spatial attention branch is added to the position attention module and a local channel attention branch is added to the channel attention module, so that the module captures global dependencies without losing local information and effectively captures the correlation between the deep features of the image. By combining these two attention mechanisms, the network extracts crucial information from input images, which makes the image feature representation more robust. In addition, a dynamic cross-entropy loss function that adjusts the loss weights during training is introduced and combined with the triplet loss function to enhance the class separability of the image hash codes while maintaining semantic similarity. Experimental results on three public datasets show that MDATSH improves image retrieval performance.
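As a rough illustration of how a triplet term and a weighted cross-entropy term can be combined over hash codes, the following Python sketch may help. It is not the MDATSH implementation: the squared Euclidean distance, the margin value, the function names, and the `ce_weight` parameter are all assumptions made for the example, and the schedule that adjusts the weight during training is not specified in this abstract, so the weight is simply passed in by the caller.

```python
import math


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Margin-based triplet loss on real-valued (relaxed) hash codes.

    Assumption: squared Euclidean distance and a fixed margin.
    """
    d_pos = sum((a - p) ** 2 for a, p in zip(anchor, positive))
    d_neg = sum((a - n) ** 2 for a, n in zip(anchor, negative))
    return max(0.0, d_pos - d_neg + margin)


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, using the log-sum-exp trick."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def combined_loss(anchor, positive, negative, logits, label, ce_weight, margin=1.0):
    """Triplet term plus a weighted classification term.

    ce_weight is a hypothetical stand-in for the dynamically adjusted
    weight; how it changes over training is left to the caller.
    """
    return (triplet_loss(anchor, positive, negative, margin)
            + ce_weight * cross_entropy(logits, label))


def binarize(code):
    """Map a relaxed code to a {-1, +1} hash code by taking the sign."""
    return [1 if c >= 0 else -1 for c in code]


def hamming(a, b):
    """Hamming distance between two hash codes of equal length."""
    return sum(x != y for x, y in zip(a, b))
```

At retrieval time, under the same assumptions, the query and database codes would be binarized with `binarize` and database items ranked by `hamming` distance to the query.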






Data availability
All the datasets explored in this paper are publicly available.
Funding
This work is supported by the National Natural Science Foundation of China (No. 62277016).
Contributions
All authors contributed equally to the article. All authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Cheng, X., Chen, J. & Wang, R. Modified dual attention triplet-supervised hashing network for image retrieval. SIViP 18, 1939–1948 (2024). https://doi.org/10.1007/s11760-023-02908-1
DOI: https://doi.org/10.1007/s11760-023-02908-1