DOI: 10.1145/3372278.3390711

Deep Adversarial Discrete Hashing for Cross-Modal Retrieval

Published: 08 June 2020

Abstract

Cross-modal hashing has received widespread attention in cross-modal retrieval because of its retrieval efficiency and low storage cost. However, most existing cross-modal hashing methods learn binary codes directly from multimedia data and therefore cannot fully exploit the semantic knowledge in the data. They also cannot learn the ranking-based similarity relevance of data points with multiple labels, and they usually relax the binary constraint on the hash codes, which causes non-negligible quantization loss during optimization. In this paper, a hashing method called Deep Adversarial Discrete Hashing (DADH) is proposed to address these issues for cross-modal retrieval. The proposed method uses adversarial training to learn features across modalities and to ensure that the distributions of the feature representations are consistent across modalities. We also introduce a weighted cosine triplet constraint that makes full use of the semantic knowledge in the multi-label annotations to ensure precise ranking relevance of item pairs. In addition, we use a discrete hashing strategy to learn discrete binary codes without relaxation, which preserves the semantic knowledge from the labels in the hash codes while minimizing the quantization loss. Ablation and comparison experiments on two cross-modal databases show that the proposed DADH improves performance and outperforms several state-of-the-art hashing methods for cross-modal retrieval.
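To make the notions of binary codes and quantization loss concrete, the following is a minimal NumPy sketch and not the DADH implementation. It assumes sign-based binarization into {-1, +1}, a mean-squared quantization gap, and Hamming-distance ranking. The function names, the 4-bit code length, and the feature values are all hypothetical and chosen only for illustration.

```python
import numpy as np


def binarize(features):
    """Map real-valued features to {-1, +1} codes (zeros map to +1)."""
    return np.where(features >= 0, 1, -1).astype(np.int8)


def quantization_loss(features, codes):
    """Mean squared gap between relaxed outputs and their binary codes."""
    return float(np.mean((features - codes) ** 2))


def hamming_distances(query_code, database_codes):
    """Hamming distance from one {-1, +1} code to every database code.

    For codes in {-1, +1}, distance = (n_bits - inner_product) / 2.
    """
    n_bits = database_codes.shape[1]
    inner = database_codes.astype(np.int32) @ query_code.astype(np.int32)
    return (n_bits - inner) // 2


# Hypothetical real-valued network outputs (assumed values, 4 bits each).
image_features = np.array([[0.9, -0.8, 0.1, -0.2],
                           [-0.7, 0.6, -0.9, 0.4]])
text_query = np.array([0.5, -0.3, 0.7, -0.6])

db_codes = binarize(image_features)   # image codes stored in the database
q_code = binarize(text_query)         # code of a text query

# Text-to-image retrieval: rank database items by Hamming distance.
distances = hamming_distances(q_code, db_codes)
ranking = np.argsort(distances, kind="stable")
```

In this toy setting the relaxed outputs sit some distance from their binary codes, and that gap is what a relaxation-based method leaves behind when it rounds at the end. Learning the codes discretely, as the abstract describes, is meant to avoid that rounding step.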

Published In

ICMR '20: Proceedings of the 2020 International Conference on Multimedia Retrieval
June 2020
605 pages
ISBN: 9781450370875
DOI: 10.1145/3372278

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. adversarial learning
  2. cross-modal retrieval
  3. discrete hashing

Qualifiers

  • Research-article

Funding Sources

  • Natural Science Foundation of China

Conference

ICMR '20

Acceptance Rates

Overall Acceptance Rate 254 of 830 submissions, 31%
