DOI: 10.1145/3503161.3548403
research-article

Improved Deep Unsupervised Hashing via Prototypical Learning

Published: 10 October 2022

Abstract

Hashing has become increasingly popular in approximate nearest neighbor search in recent years due to its storage and computational efficiency. While deep unsupervised hashing has recently shown encouraging performance, its efficacy in the more realistic unsupervised setting remains far from satisfactory because of two limitations. First, existing methods usually neglect the underlying global semantic structure in the deep feature space. Second, they do not reconstruct this global structure in the hash code space. In this work, we develop a simple yet effective approach named deeP UnsupeRvised hashing via Prototypical LEarning (PURPLE). Specifically, PURPLE introduces both feature prototypes and hashing prototypes to model the underlying semantic structures of the images in both the deep feature space and the hash code space. We then impose a smoothness constraint that regularizes the consistency of the global structures in the two spaces through our semantic prototypical consistency learning. Moreover, our method encourages prototypical consistency across different augmentations of each image via contrastive prototypical consistency learning. Comprehensive experiments on three benchmark datasets demonstrate that the proposed PURPLE performs better than a variety of state-of-the-art retrieval methods.
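
The abstract does not state the loss functions, so the following is only a minimal sketch of how a prototype-based consistency term might look, not the authors' implementation. It assumes that an image is softly assigned to prototypes by a softmax over cosine similarities, once in feature space and once in (relaxed) hash code space, and that the two assignments are compared with a cross-entropy. All function names, the temperature value, and the toy vectors are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def softmax(scores, temperature):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def prototype_assignment(vector, prototypes, temperature=0.5):
    """Soft assignment of one vector to K prototypes (assumed: cosine similarity)."""
    v = normalize(vector)
    sims = [dot(v, normalize(p)) for p in prototypes]
    return softmax(sims, temperature)


def consistency_loss(p_feature, p_hash, eps=1e-12):
    """Cross-entropy between the two assignments (assumed form of the constraint)."""
    return -sum(pf * math.log(ph + eps) for pf, ph in zip(p_feature, p_hash))


def binarize(relaxed_code):
    """Sign function mapping a relaxed code to a {-1, +1} hash code."""
    return [1 if x >= 0 else -1 for x in relaxed_code]


def hamming_distance(code_a, code_b):
    """Number of differing bits; this is what makes hash-based search cheap."""
    return sum(a != b for a, b in zip(code_a, code_b))


# Toy example: one image, two prototypes in each space (hypothetical values).
feature = [0.9, 0.1, 0.0]
feature_prototypes = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
relaxed_code = [0.8, -0.7, 0.9, -0.6]
hash_prototypes = [[1, -1, 1, -1], [-1, 1, -1, 1]]

p_feature = prototype_assignment(feature, feature_prototypes)
p_hash = prototype_assignment(relaxed_code, hash_prototypes)

# The loss is lower when both spaces assign the image to the same prototype.
loss_agree = consistency_loss(p_feature, p_hash)
loss_disagree = consistency_loss(p_feature, list(reversed(p_hash)))
print(loss_agree < loss_disagree)

hash_code = binarize(relaxed_code)
print(hamming_distance(hash_code, [1, 1, 1, -1]))
```

Under the same assumptions, the contrastive variant mentioned in the abstract could reuse `consistency_loss` by comparing the prototype assignments of two augmented views of one image rather than the assignments from the two spaces.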

Supplementary Material

MP4 File (MM22-fp3065.mp4)
While deep unsupervised hashing has recently shown encouraging performance, its efficacy in the more realistic unsupervised setting remains far from satisfactory because of two limitations. First, existing methods usually neglect the underlying global semantic structure in the deep feature space. Second, they do not reconstruct this global structure in the hash code space. In this work, we develop a simple yet effective approach named deeP UnsupeRvised hashing via Prototypical LEarning (PURPLE). Specifically, PURPLE introduces both feature prototypes and hashing prototypes to model the underlying semantic structures of the images in both the deep feature space and the hash code space. We then impose a smoothness constraint that regularizes the consistency of the global structures in the two spaces through our semantic prototypical consistency learning. Moreover, our method encourages prototypical consistency across different augmentations of each image via contrastive prototypical consistency learning.


Published In

MM '22: Proceedings of the 30th ACM International Conference on Multimedia
October 2022
7537 pages
ISBN: 9781450392037
DOI: 10.1145/3503161
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. deep hashing
  2. image retrieval
  3. unsupervised learning

Qualifiers

  • Research-article

Funding Sources

  • Medical Biometrics Perception and Analysis Engineering Laboratory
  • Guangdong Basic and Applied Basic Research Foundation
  • Shenzhen Key Technical Project
  • Shenzhen Fundamental Research Fund
  • NSFC fund
  • Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies

Conference

MM '22
Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
