DOI: 10.1145/3652583.3658017
research-article

Deep Scaling Factor Quantization Network for Large-scale Image Retrieval

Published: 07 June 2024

Abstract

Hash learning aims to map multimedia data into Hamming space, in which each data point is represented by a low-dimensional binary code and similarity relationships are preserved. Although existing hash learning methods have been used effectively in data retrieval tasks thanks to their low memory cost and high computational efficiency, two major technical challenges remain. First, because hash codes are subject to discrete constraints, traditional hashing methods typically use a relaxation strategy to learn real-valued features and then quantize them into binary codes through a sign function, which results in significant quantization errors. Second, hash codes are usually low-dimensional, which can be inadequate for preserving either the information of each data point or the relationship between two data points. These two challenges greatly limit the retrieval performance of learned hash codes. To address them, we introduce a novel quantization method called scaling factor quantization to enhance hash learning. Unlike traditional hashing methods, we propose to map the data into two parts, i.e., hash codes and scaling factors, to learn representative codes for retrieval. Specifically, we design a multi-output branch network structure, the Deep Scaling factor Quantization Network (DSQN), together with an iterative training strategy for DSQN to learn the two parts of the mapping. Comprehensive experiments conducted on three benchmark datasets demonstrate that the hash codes and scaling factors learned by DSQN significantly improve retrieval accuracy compared to existing hash learning methods.
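
As a rough illustration of the quantization error the abstract refers to, the sketch below binarizes a real-valued feature vector with a sign function and compares the squared error with and without a scalar scaling factor. The abstract does not say how DSQN computes or uses its scaling factors, so this is only an assumption: a single least-squares scalar per feature vector, not the authors' method. All function names and the sample feature values are hypothetical.

```python
def sign_quantize(features):
    """Binarize real-valued features into a {-1, +1} hash code."""
    return [1 if v >= 0 else -1 for v in features]


def scaling_factor(features, code):
    """Scalar alpha minimizing ||features - alpha * code||^2 (an assumption,
    not the DSQN formulation). Since every code entry is +1 or -1, the
    denominator code . code is simply the code length."""
    return sum(f * c for f, c in zip(features, code)) / len(code)


def quantization_error(features, code, alpha=1.0):
    """Squared error between the features and the (scaled) binary code."""
    return sum((f - alpha * c) ** 2 for f, c in zip(features, code))


# Hypothetical relaxed network output for one image.
features = [0.2, -0.1, 0.4, -0.3]

code = sign_quantize(features)            # binary hash code
alpha = scaling_factor(features, code)    # mean absolute feature value here

plain_error = quantization_error(features, code)          # code used as-is
scaled_error = quantization_error(features, code, alpha)  # code times alpha

print(code, alpha, plain_error, scaled_error)
```

In this toy case the scaled code reconstructs the features far more closely than the bare sign code, which is the kind of gap a learned scaling factor could plausibly help close. How DSQN actually learns and applies its scaling factors is described in the paper itself, not here.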


Cited By

  • (2025) Embedded Separate Deep Localization Feature Information Vision Transformer for Hash Image Retrieval. Expert Systems with Applications (126902). DOI: 10.1016/j.eswa.2025.126902. Online publication date: Feb-2025
  • (2024) A Survey on RAG Meeting LLMs: Towards Retrieval-Augmented Large Language Models. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (6491-6501). DOI: 10.1145/3637528.3671470. Online publication date: 25-Aug-2024
  • (2024) Deep Lifelong Cross-Modal Hashing. IEEE Transactions on Circuits and Systems for Video Technology 34:12 (13478-13493). DOI: 10.1109/TCSVT.2024.3450490. Online publication date: Dec-2024


    Published In

    ICMR '24: Proceedings of the 2024 International Conference on Multimedia Retrieval
    May 2024
    1379 pages
    ISBN: 9798400706196
    DOI: 10.1145/3652583
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. hash learning
    2. quantization errors
    3. scaling factor

    Qualifiers

    • Research-article

    Conference

    ICMR '24
    Acceptance Rates

    Overall Acceptance Rate 254 of 830 submissions, 31%

    Article Metrics

    • Downloads (last 12 months): 82
    • Downloads (last 6 weeks): 11
    Reflects downloads up to 19 Feb 2025
