ABSTRACT
Deep hashing has been extensively explored for image retrieval because of its fast computation and efficient storage. Because conventional deep hashing methods are ill-suited to the common real-world scenario in which data follows a long-tailed distribution, several long-tailed hashing methods have been proposed recently. However, existing long-tailed hashing methods rely on fixed class centroids and therefore cannot fully develop the discriminative ability of hash codes for tail-class samples. Specifically, fixed class centroids can neither characterize the authentic semantics of tail classes nor provide effective semantic information for hash code learning under the long-tailed setting. To this end, we propose a novel Dual Dynamic Proxy Hashing Network (DDPHN) with two sets of learnable dynamic proxies, i.e., hash proxies and feature proxies, to improve the discriminability of hash codes for tail-class samples. Compared with fixed class centroids, learnable proxies are optimized continually via the proxy learning loss and depict accurate class semantics despite the scarcity of tail-class samples. In addition to low-dimensional binary hash proxies, we introduce high-dimensional continuous feature proxies that describe semantic relationships more precisely and thus also contribute to hash code learning. To further leverage the semantic information carried by the proxies, we build a hypergraph by exploring neighborhood relationships in the feature space and then introduce a hypergraph neural network to transfer knowledge from proxies to samples in the Hamming space. Extensive experiments show the superiority of our learnable dynamic proxies and demonstrate that our method outperforms numerous deep hashing models and recent state-of-the-art long-tailed hashing methods.
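The hypergraph step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each node (a sample or a proxy) spawns one hyperedge containing itself and its k nearest neighbors in the feature space, and it applies the standard hypergraph convolution of Feng et al. (2019), D_v^{-1/2} H W D_e^{-1} H^T D_v^{-1/2} X Θ, to the hash embeddings. The function names `knn_incidence` and `hgnn_layer`, the Euclidean distance, and the closing `tanh` relaxation are assumptions made for illustration only.

```python
import numpy as np


def knn_incidence(features, k):
    """Build an incidence matrix H (n_nodes x n_edges).

    Assumption: hyperedge j contains node j and its k nearest
    neighbours by Euclidean distance in the feature space.
    """
    n = features.shape[0]
    # Pairwise squared Euclidean distances.
    sq = np.sum(features ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * (features @ features.T)
    # Force every node to rank first in its own neighbour list.
    np.fill_diagonal(dist, -np.inf)
    neighbours = np.argsort(dist, axis=1)[:, : k + 1]
    H = np.zeros((n, n))
    for j in range(n):
        H[neighbours[j], j] = 1.0
    return H


def hgnn_layer(X, H, theta, edge_weights=None):
    """One hypergraph convolution step (Feng et al., 2019).

    Computes D_v^{-1/2} H W D_e^{-1} H^T D_v^{-1/2} X theta.
    """
    n_edges = H.shape[1]
    w = np.ones(n_edges) if edge_weights is None else edge_weights
    d_v = H @ w            # weighted node degrees
    d_e = H.sum(axis=0)    # hyperedge degrees (number of member nodes)
    dv_inv_sqrt = 1.0 / np.sqrt(d_v)
    left = dv_inv_sqrt[:, None] * H          # D_v^{-1/2} H
    middle = left * (w / d_e)[None, :]       # ... W D_e^{-1}
    G = middle @ left.T                      # ... H^T D_v^{-1/2}
    return G @ X @ theta


# Hypothetical usage: 3 class proxies and 5 samples share one node set.
rng = np.random.default_rng(0)
features = rng.normal(size=(8, 16))      # continuous features (proxies + samples)
hash_embed = rng.normal(size=(8, 4))     # relaxed 4-bit hash embeddings
theta = rng.normal(size=(4, 4))          # learnable layer weights (random here)

H = knn_incidence(features, k=2)
refined = np.tanh(hgnn_layer(hash_embed, H, theta))  # tanh keeps values in (-1, 1)
```

Placing proxies and samples in the same node set means that a hyperedge containing both lets proxy information flow into neighboring sample embeddings during the convolution, which is one plausible reading of "transferring knowledge from proxies to samples".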
Index Terms
- Dual Dynamic Proxy Hashing Network for Long-tailed Image Retrieval