DOI: 10.1145/3539618.3591660
research-article

Data-Aware Proxy Hashing for Cross-modal Retrieval

Published: 18 July 2023

Abstract

Recently, numerous proxy hash code based methods, which exploit the label information of data to supervise the training of hashing models, have been proposed. Although these methods have made impressive progress, they generate proxy hash codes based only on the class information of the dataset or the labels of the data, without taking the data themselves into account. As a result, these methods are likely to generate some inappropriate proxy hash codes, which damages the retrieval performance of the hashing models. To solve this problem, we propose a novel Data-Aware Proxy Hashing method for cross-modal retrieval, called DAPH. Specifically, our proposed method first trains a data-aware proxy network that takes the data points, the label vectors of the data, and the class vectors of the dataset as inputs to generate class-based data-aware proxy hash codes, label-fused image-aware proxy hash codes, and label-fused text-aware proxy hash codes. Then, we propose a novel hash loss that exploits the three types of data-aware proxy hash codes to supervise the training of modality-specific hashing networks. After training, DAPH is able to generate discriminative hash codes that adequately preserve semantic information. Extensive experiments on three benchmark datasets show that the proposed DAPH outperforms the state-of-the-art baselines in cross-modal retrieval tasks.
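
For readers new to hashing-based retrieval, the following is a minimal sketch of how binary hash codes are typically used at query time: real-valued network outputs are binarized with a sign function, and database items of the other modality are ranked by Hamming distance to the query code. It does not implement DAPH's data-aware proxy network or its hash loss, neither of which is specified in the abstract. All function names and the toy feature values are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def binarize(features: Sequence[float]) -> List[int]:
    """Map real-valued outputs to a {-1, +1} hash code via the sign function."""
    return [1 if x >= 0 else -1 for x in features]


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the bit positions where two equal-length codes differ."""
    if len(a) != len(b):
        raise ValueError("hash codes must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def rank_by_hamming(query: Sequence[int],
                    database: Sequence[Sequence[int]]) -> List[int]:
    """Return database indices sorted by ascending Hamming distance to the query."""
    return sorted(range(len(database)),
                  key=lambda i: hamming_distance(query, database[i]))


# Toy example (assumed values): one "image" query ranked against three "text" codes.
image_code = binarize([0.9, -0.2, 0.4, -0.7])
text_codes = [
    binarize([-0.8, 0.3, -0.5, 0.6]),  # differs from the query in all four bits
    binarize([0.7, -0.1, 0.2, -0.9]),  # same code as the query
    binarize([0.5, 0.4, 0.1, -0.3]),   # differs from the query in one bit
]
ranking = rank_by_hamming(image_code, text_codes)
print(ranking)  # [1, 2, 0]
```

In this setting, the quality of retrieval depends on how well the learned codes place semantically similar images and texts at small Hamming distances, which is the property the proxy hash codes described above are meant to supervise.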

    Published In

    SIGIR '23: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval
    July 2023
    3567 pages
    ISBN:9781450394086
    DOI:10.1145/3539618
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. cross-modal
    2. data-aware
    3. hashing

    Qualifiers

    • Research-article

    Funding Sources

    • National Key R&D Plan
    • CCF-AFSG Research Fund under Grant
    • the fund of Joint Laboratory of HUST and Pingan Property & Casualty Research (HPL)
    • National Natural Science Foundation of China

    Conference

    SIGIR '23
    Acceptance Rates

    Overall Acceptance Rate 792 of 3,983 submissions, 20%

    Cited By

    • (2025) Proxy-Based Semi-Supervised Cross-Modal Hashing. Applied Sciences 15:5 (2390). DOI: 10.3390/app15052390. Online publication date: 23-Feb-2025
    • (2025) All Points Guided Adversarial Generator for Targeted Attack Against Deep Hashing Retrieval. IEEE Transactions on Information Forensics and Security 20 (1695-1709). DOI: 10.1109/TIFS.2025.3534585. Online publication date: 2025
    • (2025) Efficient Parameter-free Adaptive Hashing for Large-Scale Cross-Modal Retrieval. International Journal of Approximate Reasoning (109383). DOI: 10.1016/j.ijar.2025.109383. Online publication date: Feb-2025
    • (2025) Modality-Specific Hashing: Transform Cross-Modal Retrieval Into Single-Modal Retrieval. MultiMedia Modeling (438-451). DOI: 10.1007/978-981-96-2061-6_32. Online publication date: 9-Jan-2025
    • (2024) Text-Enhanced Graph Attention Hashing for Cross-Modal Retrieval. Entropy 26:11 (911). DOI: 10.3390/e26110911. Online publication date: 27-Oct-2024
    • (2024) Privacy-Enhanced Prototype-Based Federated Cross-Modal Hashing for Cross-Modal Retrieval. ACM Transactions on Multimedia Computing, Communications, and Applications 20:9 (1-19). DOI: 10.1145/3674507. Online publication date: 23-Sep-2024
    • (2024) Contrastive Multi-Bit Collaborative Learning for Deep Cross-Modal Hashing. IEEE Transactions on Knowledge and Data Engineering 36:11 (5835-5848). DOI: 10.1109/TKDE.2024.3419577. Online publication date: 1-Nov-2024
    • (2024) Efficient Image-Text Retrieval via Keyword-Guided Pre-Screening. IEEE Transactions on Circuits and Systems for Video Technology 34:6 (5132-5145). DOI: 10.1109/TCSVT.2023.3339489. Online publication date: 1-Jun-2024
    • (2024) Semantic Reconstruction Guided Missing Cross-modal Hashing. 2024 International Joint Conference on Neural Networks (IJCNN) (1-8). DOI: 10.1109/IJCNN60899.2024.10650222. Online publication date: 30-Jun-2024
    • (2024) Data-Focus Proxy Hashing. 2024 27th International Conference on Computer Supported Cooperative Work in Design (CSCWD) (3152-3157). DOI: 10.1109/CSCWD61410.2024.10580005. Online publication date: 8-May-2024
