DOI: 10.1145/3581783.3612328
research-article

Dual Dynamic Proxy Hashing Network for Long-tailed Image Retrieval

Published: 27 October 2023

ABSTRACT

Deep hashing has been extensively explored for image retrieval because of its fast computation and efficient storage. Conventional deep hashing methods are ill-suited to the common real-world scenario in which data follows a long-tailed distribution, so several long-tailed hashing methods have been proposed recently. However, existing long-tailed hashing methods rely on fixed class centroids and cannot fully develop the discriminative ability of hash codes for tail-class samples. Specifically, fixed class centroids cannot characterize the authentic semantics of tail classes or provide effective semantic information for hash code learning under the long-tailed setting. To this end, we propose a novel Dual Dynamic Proxy Hashing Network (DDPHN) with two sets of learnable dynamic proxies, i.e., hash proxies and feature proxies, to improve the discrimination of hash codes for tail-class samples. Compared with fixed class centroids, learnable proxies can be optimized continually via the proxy learning loss and depict accurate class semantics despite the scarcity of tail-class samples. In addition to low-dimensional binary hash proxies, we introduce high-dimensional continuous feature proxies that describe semantic relationships more precisely and also contribute to hash code learning. To further leverage the semantic information carried by proxies, we build a hypergraph by exploring neighborhood relationships in the feature space and then introduce a hypergraph neural network to transfer knowledge from proxies to samples in the Hamming space. Extensive experiments show the superiority of our learnable dynamic proxies and demonstrate that our method outperforms numerous deep hashing models and recent state-of-the-art long-tailed hashing methods.
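
The ideas named in the abstract (class proxies, binary hash codes compared in Hamming space, and a hypergraph built from feature-space neighborhoods) can be illustrated with a small, self-contained sketch. Everything below is an assumption made for illustration and is not taken from the paper: the softmax-over-cosine form of `proxy_loss`, the `scale` value, the k-nearest-neighbor rule in `knn_hyperedges`, and the toy vectors. The hypergraph neural network itself is not implemented here; only the hyperedge construction is shown.

```python
import math

def sign_code(vec):
    """Binarize a continuous vector into a {-1, +1} hash code."""
    return [1 if v >= 0 else -1 for v in vec]

def hamming(a, b):
    """Number of positions in which two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def proxy_loss(embedding, label, proxies, scale=4.0):
    """Generic proxy-based loss (an assumption, not the paper's loss):
    cross-entropy over scaled cosine similarities to every class proxy."""
    logits = [scale * cosine(embedding, p) for p in proxies]
    top = max(logits)  # subtract the max for numerical stability
    log_sum = top + math.log(sum(math.exp(l - top) for l in logits))
    return log_sum - logits[label]

def knn_hyperedges(nodes, k):
    """One hyperedge per node: the node plus its k nearest neighbors
    by Euclidean distance (a hypothetical construction rule)."""
    edges = []
    for i, centre in enumerate(nodes):
        ranked = sorted(
            (math.dist(centre, other), j)
            for j, other in enumerate(nodes)
            if j != i
        )
        edges.append(sorted([i] + [j for _, j in ranked[:k]]))
    return edges

# Toy data: two class proxies and three samples in a 4-D space.
proxies = [[1.0, 1.0, -1.0, -1.0], [-1.0, 1.0, 1.0, -1.0]]
samples = [
    [0.9, 0.8, -1.1, -0.7],   # class 0
    [-0.8, 1.2, 0.9, -1.0],   # class 1
    [0.7, 1.1, -0.6, -0.9],   # class 0
]

# Proxies and samples share one node set, so a hyperedge can link a
# sample to a nearby proxy.
hyperedges = knn_hyperedges(proxies + samples, k=1)

# Loss is lower when the sample is paired with its own class proxy.
loss_correct = proxy_loss(samples[0], 0, proxies)
loss_wrong = proxy_loss(samples[0], 1, proxies)
```

In a trained model the proxies would be parameters updated by gradient descent together with the network; here they are fixed lists so that the sketch runs with the standard library alone.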


Published in

        MM '23: Proceedings of the 31st ACM International Conference on Multimedia
        October 2023
        9913 pages
        ISBN: 9798400701085
        DOI: 10.1145/3581783

        Copyright © 2023 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        Overall Acceptance Rate: 995 of 4,171 submissions, 24%
