
Dual discriminant adversarial cross-modal retrieval

  • Published in: Applied Intelligence

Abstract

To improve the accuracy of cross-modal retrieval and enable flexible retrieval between different modalities, we propose a Dual Discriminant Adversarial Cross-modal retrieval (DDAC) method. First, DDAC combines adversarial learning with minimization of the feature-projection distance and incorporates label information into both, eliminating the heterogeneity between modalities for features that share the same semantics while preserving the distinguishability of features with different semantics. Second, cosine distance is used to minimize the inter-modal distance between features with the same labels and maximize it between features with different labels, addressing the inter-modal discrimination problem. Unlike general methods, DDAC additionally performs dual discrimination in the label space, tackling the intra-modal discrimination problem from two perspectives: probability distribution and distance. Extensive experiments on three public datasets show that DDAC outperforms state-of-the-art methods.
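As a rough illustration of the cosine-distance step described above, the sketch below computes a discrimination term over cross-modal feature pairs: it pulls together image–text pairs with the same label and pushes apart pairs with different labels via a margin hinge. The function name, the margin hinge, and the value `margin=0.5` are our assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def intermodal_cosine_loss(img_feats, txt_feats, labels, margin=0.5):
    """Sketch of a cosine-distance inter-modal discrimination term.

    Minimizes the cosine distance between cross-modal pairs sharing a
    label and penalizes different-label pairs whose cosine distance
    falls below `margin` (the hinge/margin form is an assumption).
    """
    # L2-normalise each row so the dot product equals cosine similarity
    img = img_feats / np.linalg.norm(img_feats, axis=1, keepdims=True)
    txt = txt_feats / np.linalg.norm(txt_feats, axis=1, keepdims=True)

    cos = img @ txt.T          # (n_img, n_txt) pairwise cosine similarities
    dist = 1.0 - cos           # cosine distance, in [0, 2]

    same = labels[:, None] == labels[None, :]   # mask of same-label pairs

    pos = dist[same].mean()                              # pull same-label pairs together
    neg = np.maximum(0.0, margin - dist[~same]).mean()   # push different-label pairs apart
    return pos + neg
```

With perfectly aligned same-label pairs and orthogonal different-label pairs, both terms vanish and the loss is zero, which is the behaviour the abstract's minimize/maximize objective targets.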


Figs. 1–7 (thumbnails only; captions are not available in this preview)




Author information

Corresponding author

Correspondence to Meng Wang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

He, P., Wang, M., Tu, D. et al. Dual discriminant adversarial cross-modal retrieval. Appl Intell 53, 4257–4267 (2023). https://doi.org/10.1007/s10489-022-03653-7

