Deep Hashing Network With Hybrid Attention and Adaptive Weighting for Image Retrieval


Abstract:

Because Hamming distance is cheap to compute, hashing-based image retrieval has been widely adopted. It is therefore increasingly important to quickly generate high-precision hash codes (also called hash features) from images. However, existing deep hashing methods are vulnerable to variations in image content; that is, they struggle to generate stable and consistent hash codes for similar images. In addition, generating hash codes of different lengths requires retraining the model, which is costly in training time. To address these problems, this paper proposes a deep hashing network (DHN) with a hybrid attention mechanism and adaptive weighting (HAAW) learning. The network consists mainly of a feature extraction module, a feature refinement module, a classification layer, a hash layer and an adaptive weight layer. The hybrid attention mechanism combines bottom-up pixel saliency with top-down semantic constraints: the former is achieved through channel and spatial attention (CSA), and the latter is supervised by classification labels. This encourages the network to focus on dominant semantic features without being disturbed by irrelevant objects, so that semantically similar images are mapped to similar hash codes. We further propose an adaptive weighting learning algorithm that generates a weight for each bit of the hash code produced by the deep network. Shorter hash codes are then generated directly from the available long hash code according to the importance of each bit, as represented by its weight, which avoids retraining the network to learn hash codes of different lengths. Extensive experiments on the public CIFAR-10, NUS-WIDE and ImageNet datasets show that our method achieves substantial improvements over competing methods in terms of precision and speed.
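
The idea of shortening a long hash code by bit importance can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not say how the weights are learned, so the weights, the example codes and the helper names (`top_k_bit_indices`, `shorten_code`, `hamming_distance`) are all hypothetical. The sketch only assumes that each bit has a scalar weight and that the highest-weighted bits are the ones kept.

```python
from typing import List, Sequence


def top_k_bit_indices(weights: Sequence[float], k: int) -> List[int]:
    """Return the positions of the k largest weights, in original bit order."""
    if not 0 < k <= len(weights):
        raise ValueError("k must be between 1 and the code length")
    # Rank bit positions by descending weight, then restore positional order
    # so that shortened codes keep a consistent bit layout.
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return sorted(ranked[:k])


def shorten_code(code: Sequence[int], keep: Sequence[int]) -> List[int]:
    """Keep only the selected bit positions of a binary code."""
    return [code[i] for i in keep]


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions at which two equal-length binary codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical per-bit weights and 8-bit codes (illustrative values only).
weights = [0.9, 0.1, 0.7, 0.3, 0.8, 0.2, 0.6, 0.4]
query = [1, 0, 1, 1, 0, 0, 1, 0]
item = [1, 1, 1, 0, 0, 1, 1, 0]

keep = top_k_bit_indices(weights, 4)
full_distance = hamming_distance(query, item)
short_distance = hamming_distance(shorten_code(query, keep),
                                  shorten_code(item, keep))

print(keep, full_distance, short_distance)
```

In this toy case the two codes differ only in low-weight bits, so the 4-bit codes match exactly even though the 8-bit codes differ in three positions. Real systems typically pack bits into machine words and compute Hamming distance with XOR and population count; lists of integers are used here only for readability.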
Published in: IEEE Transactions on Multimedia (Volume: 26)
Page(s): 4961 - 4973
Date of Publication: 30 October 2023
