research-article

Deep Adversarial Discrete Hashing for Cross-Modal Retrieval

Authors:

Shengyong Chen

ICMR '20: Proceedings of the 2020 International Conference on Multimedia Retrieval

Pages 525 - 531

https://doi.org/10.1145/3372278.3390711

Published: 08 June 2020

Abstract

Cross-modal hashing has received widespread attention for cross-modal retrieval tasks because of its high retrieval efficiency and low storage cost. However, most existing cross-modal hashing methods learn binary codes directly from multimedia data and therefore cannot fully exploit the semantic knowledge in the data. Furthermore, they cannot learn the ranking-based similarity relevance of data points with multiple labels. They also usually relax the binary constraint on hash codes, which causes non-negligible quantization loss during optimization. In this paper, a hashing method called Deep Adversarial Discrete Hashing (DADH) is proposed to address these issues for cross-modal retrieval. The proposed method uses adversarial training to learn features across modalities and to ensure that the feature representations of different modalities follow consistent distributions. We also introduce a weighted cosine triplet constraint that makes full use of the semantic knowledge in multi-label annotations to ensure precise ranking relevance of item pairs. In addition, we use a discrete hashing strategy to learn discrete binary codes without relaxation, which preserves the semantic knowledge from the labels in the hash codes while minimizing the quantization loss. Ablation and comparison experiments on two cross-modal databases show that the proposed DADH improves performance and outperforms several state-of-the-art hashing methods for cross-modal retrieval.
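The terms used in the abstract can be made concrete with a small NumPy sketch. It is illustrative only and is not the DADH implementation: the function names, the toy feature values, the margin, and the way the label-derived weight enters the triplet term are all assumptions, since the abstract does not give the paper's formulas. The sketch shows sign-based binarization, the quantization loss between relaxed features and their binary codes, Hamming-distance ranking over {-1, +1} codes, and one plausible form of a weighted cosine triplet hinge.

```python
import numpy as np


def binarize(features):
    """Map real-valued features to {-1, +1} codes with the sign function."""
    return np.where(features >= 0, 1, -1).astype(np.int8)


def quantization_loss(features, codes):
    """Mean squared gap between relaxed features and their binary codes."""
    return float(np.mean((features - codes) ** 2))


def hamming_distance(query_code, database_codes):
    """Hamming distance between one code and each row of a code matrix."""
    n_bits = database_codes.shape[1]
    # For {-1, +1} codes: d_H(q, b) = (n_bits - <q, b>) / 2
    inner = database_codes.astype(np.int32) @ query_code.astype(np.int32)
    return (n_bits - inner) // 2


def weighted_cosine_triplet(anchor, positive, negative, weight, margin=0.5):
    """Cosine triplet hinge scaled by a label-derived weight.

    ASSUMPTION: this weighting scheme and margin are hypothetical and are
    not taken from the paper.
    """
    def cos(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    hinge = max(0.0, margin - cos(anchor, positive) + cos(anchor, negative))
    return weight * hinge


# Toy data (hypothetical): two image feature vectors and one text query.
image_features = np.array([[0.9, -0.8, 0.7, -0.6],
                           [-0.2, 0.4, -0.9, 0.3]])
text_query = np.array([0.6, -0.3, 0.8, -0.1])

image_codes = binarize(image_features)
query_code = binarize(text_query)
distances = hamming_distance(query_code, image_codes)
ranking = np.argsort(distances)  # database items ordered nearest first
```

Because the codes take values in {-1, +1}, the Hamming distance reduces to an integer inner product, which is the source of the retrieval efficiency the abstract refers to. The quantization loss is the quantity a relaxation-free (discrete) strategy aims to keep small.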

Published In

ICMR '20: Proceedings of the 2020 International Conference on Multimedia Retrieval

June 2020

605 pages

ISBN: 9781450370875

DOI: 10.1145/3372278

General Chairs:
Cathal Gurrin, Dublin City University, Ireland
Björn Þór Jónsson, IT University of Copenhagen, Denmark
Noriko Kando, National Institute of Informatics, Tokyo

Program Chairs:
Klaus Schoeffmann, Klagenfurt University, Austria
Phoebe Chen, La Trobe University, Australia
Noel E. O'Connor, Dublin City University, Ireland

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

Natural Science Foundation of China

Conference

ICMR '20

Sponsor:

SIGMM

ICMR '20: International Conference on Multimedia Retrieval

June 8 - 11, 2020

Dublin, Ireland

Acceptance Rates

Overall Acceptance Rate 254 of 830 submissions, 31%
