research-article

Unsupervised Hashing with Contrastive Learning by Exploiting Similarity Knowledge and Hidden Structure of Data

Authors:

Jiayang Chen

MM '23: Proceedings of the 31st ACM International Conference on Multimedia

Pages 6350 - 6358

https://doi.org/10.1145/3581783.3612596

Published: 27 October 2023

Abstract

Motivated by the strong representation-learning ability of contrastive learning, several recent works have used it to learn semantically rich hash codes. However, because label information is unavailable, existing contrastive hashing methods simply follow standard contrastive learning: only the augmentation of the anchor is treated as a positive, and all other samples in the batch are treated as negatives, so a large number of potential positives are ignored. Consequently, the learned hash codes tend to be scattered in the code space, and their distances fail to accurately reflect semantic similarity. To address this issue, we propose to exploit the similarity knowledge and hidden structure of the dataset. Specifically, we first develop an intuitive approach based on self-training that comprises two main components, a pseudo-label predictor and a hash code improving module, each of which benefits from the other's output, in conjunction with the similarity knowledge obtained from pre-trained models. Furthermore, we place the intuitive approach in a more rigorous probabilistic framework and propose CGHash, a probabilistic hashing model based on conditional generative models, which is theoretically better grounded and can model the similarity knowledge and the hidden group structure more accurately. Extensive experimental results on three image datasets demonstrate that CGHash significantly outperforms both the proposed intuitive approach and existing baselines. Our code is available at https://github.com/KARLSZP/CGHash.
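To make the "ignored positives" issue concrete, the following is a minimal NumPy sketch of an InfoNCE-style batch loss in which extra positives can be switched on through a mask. It is not the CGHash model or the authors' code: the function names `contrastive_loss` and `pseudo_label_mask`, the temperature value, and the idea of averaging the log-probability over all positives of an anchor are assumptions chosen for illustration, and the pseudo-labels are simply given as an array rather than predicted.

```python
import numpy as np


def pseudo_label_mask(labels):
    """Hypothetical helper: mask[i, j] is True when samples i and j share a pseudo-label."""
    labels = np.asarray(labels)
    return labels[:, None] == labels[None, :]


def contrastive_loss(z1, z2, positive_mask=None, temperature=0.5):
    """InfoNCE-style loss over two augmented views of the same n samples.

    z1, z2        : (n, d) real-valued (relaxed) codes for view 1 and view 2.
    positive_mask : optional (n, n) boolean array marking extra positives.
                    With None, only the other view of the anchor is a positive
                    and every other sample in the batch acts as a negative.
    """
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # cosine similarity
    sim = (z @ z.T) / temperature                      # shape (2n, 2n)

    # An anchor is never compared with itself.
    self_mask = np.eye(2 * n, dtype=bool)
    sim = np.where(self_mask, -np.inf, sim)

    # Sample-level positives: the same sample, plus any masked-in samples.
    sample_pos = np.eye(n, dtype=bool)
    if positive_mask is not None:
        sample_pos = sample_pos | np.asarray(positive_mask, dtype=bool)
    # Expand to both views and drop the anchor itself.
    pos = np.tile(sample_pos, (2, 2)) & ~self_mask

    # Row-wise log-softmax, computed stably.
    row_max = sim.max(axis=1, keepdims=True)
    log_norm = np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    log_prob = sim - row_max - log_norm

    # Average the log-probability over the positives of each anchor.
    log_prob = np.where(pos, log_prob, 0.0)
    per_anchor = -log_prob.sum(axis=1) / pos.sum(axis=1)
    return float(per_anchor.mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    codes = rng.normal(size=(8, 16))
    view1 = codes + 0.1 * rng.normal(size=codes.shape)
    view2 = codes + 0.1 * rng.normal(size=codes.shape)
    labels = np.array([0, 0, 1, 1, 2, 2, 3, 3])        # assumed pseudo-labels
    print("augmentation-only positives:", contrastive_loss(view1, view2))
    print("with pseudo-label positives:",
          contrastive_loss(view1, view2, pseudo_label_mask(labels)))
```

With `positive_mask=None`, samples that share a pseudo-label still sit in the denominator as negatives, which is the behaviour the abstract criticises. Passing a mask moves them into the set of positives, so the loss no longer pushes their codes apart.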


Cited By

Guan Z, Zhao W, Liu H, Nakashima Y, Babaguchi N, He X. (2025). Cross-Modal Guided Visual Representation Learning for Social Image Retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence 47:3 (2186-2198). Online publication date: Mar-2025.
https://doi.org/10.1109/TPAMI.2024.3519112
Chen Y, Long Y, Yang Z, Long J. (2025). Correlation embedding semantic-enhanced hashing for multimedia retrieval. Image and Vision Computing (105421). Online publication date: Jan-2025.
https://doi.org/10.1016/j.imavis.2025.105421
Liang Y, Li T, Yu X, Li B, Jin T. (2024). Self-Quantization with Adaptive Codebooks for Unsupervised Image Retrieval. Pattern Recognition and Computer Vision (546-560). Online publication date: 9-Nov-2024.
https://doi.org/10.1007/978-981-97-8792-0_38

Index Terms

Unsupervised Hashing with Contrastive Learning by Exploiting Similarity Knowledge and Hidden Structure of Data
1. Information systems
  1. Information retrieval


Published In


MM '23: Proceedings of the 31st ACM International Conference on Multimedia

October 2023

9913 pages

ISBN:9798400701085

DOI:10.1145/3581783

General Chairs:
Abdulmotaleb El Saddik (University of Ottawa, Canada & MBZUAI, UAE)
Tao Mei (HiDream.ai, China)
Rita Cucchiara (University of Modena and Reggio Emilia, Italy)

Program Chairs:
Marco Bertini (University of Florence, Italy)
Diana Patricia Tobon Vallejo (Universidad de Medellin, Colombia)
Pradeep K. Atrey (University at Albany, State University of New York, USA)
M. Shamim Hossain (King Saud University, KSA)

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States


Conference

MM '23

Sponsor:

SIGMM

MM '23: The 31st ACM International Conference on Multimedia

October 29 - November 3, 2023

Ottawa ON, Canada

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
