Deep Scaling Factor Quantization Network for Large-scale Image Retrieval

Authors:

Xu Wu

ICMR '24: Proceedings of the 2024 International Conference on Multimedia Retrieval

Pages 851 - 859

https://doi.org/10.1145/3652583.3658017

Published: 07 June 2024

Abstract

Hash learning aims to map multimedia data into Hamming space, in which each data point is represented by a low-dimensional binary code and similarity relationships are preserved. Although existing hash learning methods have been used effectively in data retrieval tasks because of their low memory cost and high computational efficiency, two major technical challenges remain. First, because of the discrete constraints on hash codes, traditional hashing methods typically use a relaxation strategy to learn real-valued features and then quantize them into binary codes through a sign function, which results in significant quantization errors. Second, hash codes are usually low-dimensional, which may be inadequate to preserve either the information of each data point or the relationship between two data points. These two challenges greatly limit the retrieval performance of learned hash codes. To solve these problems, we introduce a novel quantization method called scaling factor quantization to enhance hash learning. Unlike traditional hashing methods, we propose to map the data into two parts, i.e., hash codes and scaling factors, to learn representative codes for retrieval. Specifically, we design a multi-output branch network structure, i.e., the Deep Scaling factor Quantization Network (DSQN), and an iterative training strategy for DSQN to learn the two parts of the mapping. Comprehensive experiments conducted on three benchmark datasets demonstrate that the hash codes and scaling factors learned by DSQN significantly improve retrieval accuracy compared to existing hash learning methods.
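As a rough illustration of the quantization error mentioned above, the following Python sketch compares plain sign quantization with a variant that also keeps one real-valued scale per code. It is not the DSQN method: the abstract does not say how scaling factors are computed, so the sketch assumes a single per-vector scale set to the mean absolute feature value, which is the closed-form minimizer of the squared error when the signs are fixed. The function names and the random stand-in features are hypothetical.

```python
import random


def binarize(features):
    """Quantize real-valued features to +1/-1 with a sign function (0 maps to +1)."""
    return [1.0 if v >= 0 else -1.0 for v in features]


def optimal_scale(features):
    """Assumed scaling rule (not taken from the paper): for b = sign(x),
    the alpha minimizing ||x - alpha * b||^2 is the mean of |x_i|."""
    return sum(abs(v) for v in features) / len(features)


def squared_error(features, codes, scale=1.0):
    """Squared reconstruction error between features and scale * codes."""
    return sum((x - scale * b) ** 2 for x, b in zip(features, codes))


# Hypothetical relaxed network outputs: small real values around zero.
random.seed(0)
features = [random.gauss(0.0, 0.3) for _ in range(32)]

codes = binarize(features)
alpha = optimal_scale(features)

err_sign = squared_error(features, codes)           # plain sign quantization
err_scaled = squared_error(features, codes, alpha)  # sign code plus one scale

print(f"sign only:    {err_sign:.4f}")
print(f"sign + scale: {err_scaled:.4f}")
```

Because a scale of 1 is one of the candidates the closed-form minimizer considers, the scaled error can never exceed the sign-only error under this assumption. How DSQN actually learns and uses its scaling factors is described in the full paper, not in the abstract.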

Cited By

J. Zhang, S. Cheng, and L. Wang. 2025. Embedded Separate Deep Localization Feature Information Vision Transformer for Hash Image Retrieval. Expert Systems with Applications (2025), 126902. Online publication date: Feb-2025. https://doi.org/10.1016/j.eswa.2025.126902

W. Fan, Y. Ding, L. Ning, S. Wang, H. Li, D. Yin, T. Chua, Q. Li, R. Baeza-Yates, and F. Bonchi. 2024. A Survey on RAG Meeting LLMs: Towards Retrieval-Augmented Large Language Models. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 6491--6501. Online publication date: 25-Aug-2024. https://dl.acm.org/doi/10.1145/3637528.3671470

L. Xu, H. Li, B. Zheng, W. Li, and J. Lv. 2024. Deep Lifelong Cross-Modal Hashing. IEEE Transactions on Circuits and Systems for Video Technology 34, 12 (2024), 13478--13493. Online publication date: Dec-2024. https://doi.org/10.1109/TCSVT.2024.3450490

Index Terms

1. Information systems
  1. Information retrieval

Published In

ICMR '24: Proceedings of the 2024 International Conference on Multimedia Retrieval

May 2024

1379 pages

ISBN: 9798400706196

DOI: 10.1145/3652583

General Chairs:
Cathal Gurrin, Dublin City University, Ireland
Rachada Kongkachandra, Thammasat University, Thailand
Klaus Schoeffmann, Klagenfurt University, Austria

Program Chairs:
Duc-Tien Dang-Nguyen, University of Bergen, Norway
Luca Rossetto, University of Zurich, Switzerland
Shin'ichi Satoh, National Institute of Informatics, Japan
Liting Zhou, Dublin City University, Ireland

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Science and Technology Planning Project of Shenzhen Municipality
The Natural Science Foundation of Guangdong Province
The Natural Science Foundation of China

Conference

ICMR '24

ICMR '24: International Conference on Multimedia Retrieval

June 10 - 14, 2024

Phuket, Thailand

Acceptance Rates

Overall Acceptance Rate 254 of 830 submissions, 31%
