DOI: 10.1145/3323873.3325045
Research Article

Adversary Guided Asymmetric Hashing for Cross-Modal Retrieval

Published: 05 June 2019

Abstract

Cross-modal hashing has attracted considerable attention for large-scale multimodal retrieval tasks, and many hashing methods have been proposed for cross-modal retrieval. However, these methods pay insufficient attention to the feature learning process and cannot fully preserve both the higher-ranking correlation of item pairs and the multi-label semantics of each item, so the quality of the binary codes may be degraded. To tackle these problems, in this paper we propose a novel deep cross-modal hashing method called Adversary Guided Asymmetric Hashing (AGAH). Specifically, it employs an adversarial-learning-guided multi-label attention module to enhance the feature learning part, which learns discriminative feature representations while preserving cross-modal invariability. Furthermore, to generate hash codes that fully preserve the multi-label semantics of all items, we propose an asymmetric hashing method that utilizes a multi-label binary code map to equip the hash codes with multi-label semantic information. In addition, to ensure that all similar item pairs rank higher in correlation than dissimilar ones, we adopt a new triplet-margin constraint together with a cosine quantization technique for similarity preservation in Hamming space. Extensive empirical studies show that AGAH outperforms several state-of-the-art methods for cross-modal retrieval.
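The triplet-margin constraint and cosine quantization mentioned above can be illustrated with a minimal sketch. This is not the paper's actual formulation or training code: the function names, toy vectors, and margin value are hypothetical, and the sketch only shows the general idea of (a) forcing an anchor's cosine similarity to a same-label item to exceed its similarity to a different-label item by a margin, and (b) pulling a real-valued code toward its binary sign vector.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def triplet_margin_loss(anchor, positive, negative, margin=0.5):
    # Penalize the anchor unless its similarity to the positive item
    # exceeds its similarity to the negative item by at least `margin`.
    return max(0.0, margin - cosine(anchor, positive) + cosine(anchor, negative))

def cosine_quantization_loss(h):
    # Push a real-valued code toward its binary sign vector by
    # maximizing cosine similarity with sign(h); loss is 1 - cos.
    b = np.sign(h)
    b[b == 0] = 1.0  # break ties toward +1
    return 1.0 - cosine(h, b)

# Toy continuous codes: an image anchor and two text items.
anchor   = np.array([ 0.9, -0.8,  0.7, -0.6])
positive = np.array([ 0.8, -0.9,  0.6, -0.7])   # shares labels with anchor
negative = np.array([-0.7,  0.8, -0.9,  0.5])   # different labels

loss = triplet_margin_loss(anchor, positive, negative) \
     + cosine_quantization_loss(anchor)
print(round(loss, 4))
```

Here the anchor is already well separated from the negative item, so the triplet term vanishes and only a small quantization penalty remains; at retrieval time the final binary code would be `sign(h)`.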



Published In

ICMR '19: Proceedings of the 2019 on International Conference on Multimedia Retrieval
June 2019
427 pages
ISBN:9781450367653
DOI:10.1145/3323873

Publisher

Association for Computing Machinery

New York, NY, United States



Badges

  • Best Student Paper

Author Tags

  1. adversary learning
  2. asymmetric hashing
  3. cross-modal hashing
  4. multimodal retrieval


Funding Sources

  • Chinese Academy of Sciences

Conference

ICMR '19

Acceptance Rates

Overall Acceptance Rate 254 of 830 submissions, 31%



Article Metrics

  • Downloads (last 12 months): 72
  • Downloads (last 6 weeks): 8
Reflects downloads up to 07 Mar 2025.

Cited By

  • (2025) Cross-Modal 3D Shape Retrieval via Heterogeneous Dynamic Graph Representation. IEEE Transactions on Pattern Analysis and Machine Intelligence 47(4): 2370-2387. DOI: 10.1109/TPAMI.2024.3524440
  • (2025) Adversarial Contrastive Autoencoder With Shared Attention for Audio-Visual Correlation Learning. IEEE Access 13: 39753-39764. DOI: 10.1109/ACCESS.2025.3546610
  • (2025) Semantic decomposition and enhancement hashing for deep cross-modal retrieval. Pattern Recognition 160(C). DOI: 10.1016/j.patcog.2024.111225
  • (2025) Deep multi-similarity hashing via label-guided network for cross-modal retrieval. Neurocomputing 616: 128830. DOI: 10.1016/j.neucom.2024.128830
  • (2025) Enhancing semantic audio-visual representation learning with supervised multi-scale attention. Pattern Analysis and Applications 28(2). DOI: 10.1007/s10044-025-01414-z
  • (2025) Adversarial Graph Convolutional Network Hashing for Cross-Modal Retrieval. Web and Big Data: APWeb-WAIM 2024 International Workshops: 69-80. DOI: 10.1007/978-981-96-0055-7_6
  • (2024) Text-Enhanced Graph Attention Hashing for Cross-Modal Retrieval. Entropy 26(11): 911. DOI: 10.3390/e26110911
  • (2024) Multi-Grained Similarity Preserving and Updating for Unsupervised Cross-Modal Hashing. Applied Sciences 14(2): 870. DOI: 10.3390/app14020870
  • (2024) DAC: 2D-3D Retrieval with Noisy Labels via Divide-and-Conquer Alignment and Correction. Proceedings of the 32nd ACM International Conference on Multimedia: 4217-4226. DOI: 10.1145/3664647.3680859
  • (2024) Anchor-aware Deep Metric Learning for Audio-visual Retrieval. Proceedings of the 2024 International Conference on Multimedia Retrieval: 211-219. DOI: 10.1145/3652583.3658067
