research-article

Dual Dynamic Proxy Hashing Network for Long-tailed Image Retrieval

Authors:
Yan Jiang, University of Science and Technology of China, Hefei, China (ORCID: 0000-0002-7284-1226)
Hongtao Xie, University of Science and Technology of China, Hefei, China (ORCID: 0000-0002-6249-5315)
Lei Zhang, University of Science and Technology of China, Hefei, China (ORCID: 0000-0002-2839-8693)
Pandeng Li, University of Science and Technology of China, Hefei, China (ORCID: 0000-0002-0717-8659)
Dongming Zhang, State Key Laboratory of Communication Content Cognition, People's Daily Online, Beijing, China (ORCID: 0000-0002-1237-7177)
Yongdong Zhang, University of Science and Technology of China, Hefei, China (ORCID: 0000-0002-1151-1792)

MM '23: Proceedings of the 31st ACM International Conference on Multimedia, October 2023, Pages 8942–8953, https://doi.org/10.1145/3581783.3612328

Published: 27 October 2023

ABSTRACT

Deep hashing has been extensively explored for image retrieval because of its fast computation and efficient storage. Conventional deep hashing methods are ill-suited to the common real-world scenario in which data follows a long-tailed distribution, so several long-tailed hashing methods have been proposed recently. However, existing long-tailed hashing methods rely on fixed class centroids and cannot fully develop the discriminative ability of hash codes for tail-class samples. Specifically, fixed class centroids can neither characterize the authentic semantics of tail classes nor provide effective semantic information for hash code learning under the long-tailed setting. To this end, we propose a novel Dual Dynamic Proxy Hashing Network (DDPHN) with two sets of learnable dynamic proxies, i.e., hash proxies and feature proxies, to improve the discrimination of hash codes for tail-class samples. Compared with fixed class centroids, learnable proxies are optimized continuously via the proxy learning loss and depict accurate class semantics despite the scarcity of tail-class samples. In addition to low-dimensional binary hash proxies, we introduce high-dimensional continuous feature proxies that describe semantic relationships more precisely and thus also contribute to hash code learning. To further leverage the semantic information carried by proxies, we build a hypergraph by exploring neighborhood relationships in the feature space and then introduce a hypergraph neural network to transfer knowledge from proxies to samples in the Hamming space. Extensive experiments show the superiority of our learnable dynamic proxies and demonstrate that our method outperforms numerous deep hashing models and recent state-of-the-art long-tailed hashing methods.
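The hypergraph step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how hyperedges are formed or how the propagation is parameterized. The sketch assumes that each node (sample or proxy) spawns one hyperedge containing itself and its k nearest neighbours under Euclidean distance, that all hyperedges have unit weight, that propagation follows the general hypergraph convolution form of Feng et al. (2019), and that tanh serves as a relaxation before binarization. The function names `knn_hypergraph` and `hgnn_layer`, the random inputs, and all sizes are hypothetical.

```python
import numpy as np


def knn_hypergraph(features: np.ndarray, k: int) -> np.ndarray:
    """Incidence matrix H (nodes x hyperedges).

    Assumption: hyperedge j contains node j and its k nearest
    neighbours in feature space (Euclidean distance).
    """
    n = features.shape[0]
    # Pairwise squared Euclidean distances between all nodes.
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sum(diff ** 2, axis=-1)
    # Force every node to rank first in its own neighbour list.
    np.fill_diagonal(dist, -1.0)
    nearest = np.argsort(dist, axis=1)[:, : k + 1]
    H = np.zeros((n, n))
    for edge, members in enumerate(nearest):
        H[members, edge] = 1.0
    return H


def hgnn_layer(X: np.ndarray, H: np.ndarray, theta: np.ndarray) -> np.ndarray:
    """One propagation step: tanh(Dv^-1/2 H De^-1 H^T Dv^-1/2 X theta).

    Assumption: unit hyperedge weights; tanh is an illustrative choice.
    """
    dv_inv_sqrt = 1.0 / np.sqrt(H.sum(axis=1))  # inverse sqrt node degrees
    de_inv = 1.0 / H.sum(axis=0)                # inverse hyperedge degrees
    left = dv_inv_sqrt[:, None] * H * de_inv[None, :]
    right = H.T * dv_inv_sqrt[None, :]
    return np.tanh(left @ right @ X @ theta)


# Toy data: 8 samples and 2 class proxies share one feature space.
rng = np.random.default_rng(0)
num_samples, num_classes, feat_dim, code_len = 8, 2, 16, 4
sample_feats = rng.normal(size=(num_samples, feat_dim))
feature_proxies = rng.normal(size=(num_classes, feat_dim))  # learnable in practice
nodes = np.concatenate([sample_feats, feature_proxies], axis=0)

# Build the hypergraph in feature space, then propagate hash-space embeddings.
H = knn_hypergraph(nodes, k=3)
hash_inputs = rng.normal(size=(num_samples + num_classes, code_len))
theta = np.eye(code_len)  # stand-in for a learnable weight matrix
relaxed = hgnn_layer(hash_inputs, H, theta)
codes = np.where(relaxed >= 0, 1, -1)  # binarize to {-1, +1} hash codes
```

Because samples and proxies are nodes of the same hypergraph, a proxy that shares a hyperedge with a sample contributes to that sample's propagated embedding. This is one way to read "transfer knowledge from proxies to samples"; the paper's actual construction may differ.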

Index Terms

1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking
      1. Similarity measures
    2. Specialized information retrieval
      1. Multimedia and multimodal retrieval
        1. Image search

Published in
MM '23: Proceedings of the 31st ACM International Conference on Multimedia
October 2023
9913 pages
ISBN: 9798400701085
DOI: 10.1145/3581783
General Chairs:
Abdulmotaleb El Saddik, University of Ottawa, Canada & MBZUAI, UAE
Tao Mei, HiDream.ai, China
Rita Cucchiara, University of Modena and Reggio Emilia, Italy
Program Chairs:
Marco Bertini, University of Florence, Italy
Diana Patricia Tobon Vallejo, Universidad de Medellin, Colombia
Pradeep K. Atrey, University at Albany, State University of New York, USA
M. Shamim Hossain, King Saud University, KSA
Copyright © 2023 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
deep hashing
large-scale image retrieval
long-tailed learning
Acceptance Rates
Overall Acceptance Rate: 995 of 4,171 submissions, 24%