Intra-class low-rank regularization for supervised and semi-supervised cross-modal retrieval

Kang, Peipei; Lin, Zehang; Yang, Zhenguo; Fang, Xiaozhao; Bronstein, Alexander M.; Li, Qing; Liu, Wenyin

doi:10.1007/s10489-021-02308-3

Intra-class low-rank regularization for supervised and semi-supervised cross-modal retrieval

Published: 22 April 2021

Volume 52, pages 33–54, (2022)
Cite this article

Applied Intelligence Aims and scope Submit manuscript

Peipei Kang ORCID: orcid.org/0000-0001-8637-051X^1,2,
Zehang Lin³,
Zhenguo Yang¹,
Xiaozhao Fang⁴,
Alexander M. Bronstein²,
Qing Li³ &
…
Wenyin Liu¹

1020 Accesses
6 Citations
Explore all metrics

Abstract

Cross-modal retrieval aims to retrieve related items across different modalities, for example, using an image query to retrieve related text. The existing deep methods ignore both the intra-modal and inter-modal intra-class low-rank structures when fusing various modalities, which decreases the retrieval performance. In this paper, two deep models (denoted as ILCMR and Semi-ILCMR) based on intra-class low-rank regularization are proposed for supervised and semi-supervised cross-modal retrieval, respectively. Specifically, ILCMR integrates the image network and text network into a unified framework to learn a common feature space by imposing three regularization terms to fuse the cross-modal data. First, to align them in the label space, we utilize semantic consistency regularization to convert the data representations to probability distributions over the classes. Second, we introduce an intra-modal low-rank regularization, which encourages the intra-class samples that originate from the same space to be more relevant in the common feature space. Third, an inter-modal low-rank regularization is applied to reduce the cross-modal discrepancy. To enable the low-rank regularization to be optimized using automatic gradients during network back-propagation, we propose the rank-r approximation and specify the explicit gradients for theoretical completeness. In addition to the three regularization terms that rely on label information incorporated by ILCMR, we propose Semi-ILCMR in the semi-supervised regime, which introduces a low-rank constraint before projecting the general representations into the common feature space. Extensive experiments on four public cross-modal datasets demonstrate the superiority of ILCMR and Semi-ILCMR over other state-of-the-art methods.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Learning with Noisy Correspondence

Article 13 April 2024

Learning to Prompt for Vision-Language Models

Article 31 July 2022

Robust zero-shot discrete hashing with noisy labels for cross-modal retrieval

Article 13 April 2024

References

Cao W, Lin Q, He Z, He Z (2019) Hybrid representation learning for cross-modal retrieval. Neurocomputing 345:45–57
Article Google Scholar
Catelli R, Casola V, De Pietro G, Fujita H, Esposito M (2021) Combining contextualized word representation and sub-document level analysis through bi-lstm+ crf architecture for clinical de-identification. Knowl-Based Syst 213:106649
Article Google Scholar
Cheng Q, Gu X (2020) Bridging multimedia heterogeneity gap via graph representation learning for cross-modal retrieval. Neural Networks
Chua T S, Tang J, Hong R, Li H, Luo Z, Zheng Y (2009) Nus-wide: a real-world web image database from national university of singapore. In: Proceedings of the ACM International Conference on Image and Video Retrieval, pp 1–9. https://doi.org/10.1145/1646396.1646452
Deng T, Ye D, Ma R, Fujita H, Xiong L (2020) Low-rank local tangent space embedding for subspace clustering. Inf Sci 508:1–21
Article MathSciNet Google Scholar
Devlin J, Chang M W, Lee K, Toutanova K (2019) Bert: Pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp 4171–4186
Ding Z, Fu Y (2018) Deep transfer low-rank coding for cross-domain learning. IEEE Trans Neural Netw Learn Syst 30(6):1768–1779. https://doi.org/10.1109/TNNLS.2018.2874567
Article MathSciNet Google Scholar
Ding Z, Shao M, Fu Y (2018) Generative zero-shot learning via low-rank embedded semantic dictionary. IEEE Transactions on Pattern Analysis and Machine Intelligence. https://doi.org/10.1109/TPAMI.2018.2867870
Eckart C, Young G (1939) A principal axis transformation for non-hermitian matrices. Bull Am Math Soc 45(2):118–121
Article MathSciNet Google Scholar
Esposito M, Damiano E, Minutolo A, De Pietro G, Fujita H (2020) Hybrid query expansion using lexical resources and word embeddings for sentence retrieval in question answering. Inf Sci 514:88–105
Article Google Scholar
Fang X, Han N, Wu J, Xu Y, Yang J, Wong W K, Li X (2018) Approximate low-rank projection learning for feature extraction. IEEE Trans Neural Netw Learn Syst 29(11):5228–5241. https://doi.org/10.1109/TNNLS.2018.2796133
Article MathSciNet Google Scholar
Fei L, Xu Y, Fang X, Yang J (2017) Low rank representation with adaptive distance penalty for semi-supervised subspace classification. Pattern Recogn 67:252–262. https://doi.org/10.1016/j.patcog.2017.02.017
Article Google Scholar
Feng F, Wang X, Li R (2014) Cross-modal retrieval with correspondence autoencoder. In: Proceedings of the 22nd ACM International Conference on Multimedia. ACM, pp 7–16. https://doi.org/10.1145/2647868.2654902
Golub G H, Hoffman A, Stewart G W (1987) A generalization of the eckart-young-mirsky matrix approximation theorem. Linear Algebra Appl 88:317–327
Article MathSciNet Google Scholar
Hardoon D R, Szedmak S, Shawe-Taylor J (2004) Canonical correlation analysis: an overview with application to learning methods. Neural Comput 16(12):2639–2664. https://doi.org/10.1162/0899766042321814
Article Google Scholar
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 770–778
He Y, Xiang S, Kang C, Wang J, Pan C (2016) Cross-modal retrieval via deep and bidirectional representation learning. IEEE Trans Multimed 18(7):1363–1377. https://doi.org/10.1109/TMM.2016.2558463
Article Google Scholar
Hu P, Zhen L, Peng D, Liu P (2019) Scalable deep multimodal learning for cross-modal retrieval. In: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, pp 635–644
Kang P, Fang X, Zhang W, Teng S, Fei L, Xu Y, Zheng Y (2018) Supervised group sparse representation via intra-class low-rank constraint. In: Chinese Conference on Biometric Recognition. Springer, pp 206–213. https://doi.org/10.1007/978-3-319-97909-0_22
Kang P, Lin Z, Yang Z, Fang X, Li Q, Liu W (2019) Deep semantic space with intra-class low-rank constraint for cross-modal retrieval. In: Proceedings of the 2019 on International Conference on Multimedia Retrieval. ACM, pp 226–234. https://doi.org/10.1145/3323873.3325029
Lezama J, Qiu Q, Musé P, Sapiro G (2018) Ole: Orthogonal low-rank embedding-a plug and play geometric loss for deep learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 8109–8118
Li C, Deng C, Wang L, Xie D, Liu X (2019) Coupled cyclegan: Unsupervised hashing network for cross-modal retrieval. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol 33, pp 176–183
Li K, Qi G J, Ye J, Hua K A (2016) Linear subspace ranking hashing for cross-modal retrieval. IEEE Trans Pattern Anal Mach Intell 39(9):1825–1838. https://doi.org/10.1109/TPAMI.2016.2610969
Article Google Scholar
Liu H, Feng Y, Zhou M, Qiang B (2020) Semantic ranking structure preserving for cross-modal retrieval. Appl Intell:1–11
Maaten Lvd, Hinton G (2008) Visualizing data using t-sne. J Mach Learn Res 9:2579–2605
MATH Google Scholar
Peng Y, Huang X, Qi J (2016) Cross-media shared representation by hierarchical learning with multiple deep networks. In: Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, pp 3846–3853
Peng Y, Huang X, Zhao Y (2017) An overview of cross-media retrieval: concepts, methodologies, benchmarks, and challenges. IEEE Trans Circ Syst Video Technol 28 (9):2372–2385. https://doi.org/10.1109/TCSVT.2017.2705068
Article Google Scholar
Peng Y, Qi J, Huang X, Yuan Y (2017) Ccl: Cross-modal correlation learning with multigrained fusion by hierarchical network. IEEE Trans Multimed 20(2):405–420. https://doi.org/10.1109/TMM.2017.2742704
Article Google Scholar
Peters M E, Neumann M, Iyyer M, Gardner M, Clark C, Lee K, Zettlemoyer L (2018) Deep contextualized word representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp 2227–2237
Pota M, Marulli F, Esposito M, De Pietro G, Fujita H (2019) Multilingual pos tagging by a composite deep architecture based on character-level features and on-the-fly enriched word embeddings. Knowl-Based Syst 164:309–323
Article Google Scholar
Qi J, Peng Y (2018) Cross-modal bidirectional translation via reinforcement learning. In: Proceedings of the 27th International Joint Conference on Artificial Intelligence, pp 2630–2636
Qiang H, Wan Y, Liu Z, Xiang L, Meng X (2020) Discriminative deep asymmetric supervised hashing for cross-modal retrieval. Knowl-Based Syst 204:106188
Article Google Scholar
Qiu Q, Sapiro G (2015) Learning transformations for clustering and classification. J Mach Learn Res 16(1):187–225
MathSciNet MATH Google Scholar
Rashtchian C, Young P, Hodosh M, Hockenmaier J (2010) Collecting image annotations using amazon’s mechanical turk. In: Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon’s Mechanical Turk. Association for Computational Linguistics, pp 139–147
Rasiwasia N, Costa Pereira J, Coviello E, Doyle G, Lanckriet G R, Levy R, Vasconcelos N (2010) A new approach to cross-modal multimedia retrieval. In: Proceedings of the 18th ACM International Conference on Multimedia. ACM, pp 251–260. https://doi.org/10.1145/1873951.1873987
Shang F, Zhang H, Zhu L, Sun J (2019) Adversarial cross-modal retrieval based on dictionary learning. Neurocomputing 355:93–104
Article Google Scholar
Shen T H, Liu L, Yang Y, Xu X, Huang Z, Shen F, Hong R (2020) Exploiting subspace relation in semantic labels for cross-modal hashing. IEEE Trans Knowl Data Eng:1–1
Situ R, Yang Z, Lv J, Li Q, Liu W (2018) Cross-modal event retrieval: a dataset and a baseline using deep semantic learning. In: Pacific Rim Conference on Multimedia. Springer, pp 147–157. https://doi.org/10.1007/978-3-030-00767-6_14
Wang B, Yang Y, Xu X, Hanjalic A, Shen H T (2017) Adversarial cross-modal retrieval. In: Proceedings of the 25th ACM International Conference on Multimedia, pp 154–162. https://doi.org/10.1145/3123266.3123326
Wang D, Gao X B, Wang X, He L (2018) Label consistent matrix factorization hashing for large-scale cross-modal similarity search. IEEE Transactions on Pattern Analysis and Machine Intelligence. https://doi.org/10.1109/TPAMI.2018.2861000
Wang X, Hu P, Zhen L, Peng D (2021) Drsl: Deep relational similarity learning for cross-modal retrieval. Inf Sci 546:298–311. https://doi.org/10.1016/j.ins.2020.08.009
Article Google Scholar
Wei Y, Zhao Y, Lu C, Wei S, Liu L, Zhu Z, Yan S (2016) Cross-modal retrieval with cnn visual features: a new baseline. IEEE Trans Cybern 47(2):449–460. https://doi.org/10.1109/TCYB.2016.2519449
Google Scholar
Wen J, Xu Y, Liu H (2018) Incomplete multiview spectral clustering with adaptive graph learning. IEEE Transactions on Cybernetics. https://doi.org/10.1109/TCYB.2018.2884715
Wu F, Jing X Y, Wu Z, Ji Y, Dong X, Luo X, Huang Q, Wang R (2020) Modality-specific and shared generative adversarial network for cross-modal retrieval. Pattern Recogn:107335
Xiao Q, Dai J, Luo J, Fujita H (2019) Multi-view manifold regularized learning-based method for prioritizing candidate disease mirnas. Knowl-Based Syst 175:118–129
Article Google Scholar
Xu X, He L, Lu H, Gao L, Ji Y (2019) Deep adversarial metric learning for cross-modal retrieval. World Wide Web 22(2):657–672
Article Google Scholar
Yan F, Mikolajczyk K (2015) Deep correlation for matching images and text. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 3441–3450
Yang X, Jiang X, Tian C, Wang P, Zhou F, Fujita H (2020) Inverse projection group sparse representation for tumor classification: a low rank variation dictionary approach. Knowl-Based Syst 196:105768
Article Google Scholar
Yang Z, Lin Z, Kang P, Lv J, Li Q, Liu W (2020) Learning shared semantic space with correlation alignment for cross-modal event retrieval. ACM Trans Multimed Comput Commun Appl 16(1):1–22. https://doi.org/10.1145/3374754
Article Google Scholar
Zhai X, Peng Y, Xiao J (2013) Learning cross-media joint representation with sparse and semisupervised regularization. IEEE Trans Circ Syst Video Technol 24(6):965–978. https://doi.org/10.1109/TCSVT.2013.2276704
Article Google Scholar
Zhan S, Wu J, Han N, Wen J, Fang X (2019) Unsupervised feature extraction by low-rank and sparsity preserving embedding. Neural Netw 109:56–66. https://doi.org/10.1016/j.neunet.2018.10.001
Article Google Scholar
Zhang L, Ma B, Li G, Huang Q, Tian Q (2017) Generalized semi-supervised and structured subspace learning for cross-modal retrieval. IEEE Trans Multimed 20(1):128–141. https://doi.org/10.1109/TMM.2017.2723841
Article Google Scholar
Zhang X (2017) Matrix analysis and applications. Cambridge University Press
Zhen L, Hu P, Wang X, Peng D (2019) Deep supervised cross-modal retrieval. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 10394–10403
Zheng Z, Zheng L, Garrett M, Yang Y, Xu M, Shen Y D (2020) Dual-path convolutional image-text embeddings with instance loss. ACM Trans Multimed Comput Commun Appl 16(2):1–23. https://doi.org/10.1145/3383184
Article Google Scholar

Download references

Acknowledgements

This work is supported by the National Natural Science Foundation of China (No. 62076073), the Guangdong Basic and Applied Basic Research Foundation (No. 2020A1515010616), the Guangdong Innovative Research Team Program (No. 2014ZT05G157).

Author information

Authors and Affiliations

School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, China
Peipei Kang, Zhenguo Yang & Wenyin Liu
Computer Science Department, Technion, Haifa, Israel
Peipei Kang & Alexander M. Bronstein
Department of Computing, Hong Kong Polytechnic University, Hong Kong, China
Zehang Lin & Qing Li
Department of Automation, Guangdong University of Technology, Guangzhou, China
Xiaozhao Fang

Authors

Peipei Kang
View author publications
You can also search for this author in PubMed Google Scholar
Zehang Lin
View author publications
You can also search for this author in PubMed Google Scholar
Zhenguo Yang
View author publications
You can also search for this author in PubMed Google Scholar
Xiaozhao Fang
View author publications
You can also search for this author in PubMed Google Scholar
Alexander M. Bronstein
View author publications
You can also search for this author in PubMed Google Scholar
Qing Li
View author publications
You can also search for this author in PubMed Google Scholar
Wenyin Liu
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding authors

Correspondence to Zhenguo Yang or Wenyin Liu.

Ethics declarations

Competing interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A:: Derivation of (13)

We acquired the approximation of rank calculation as shown in (7):

$$ rank(A) \approx \rho(A,r) = \frac{{\sum}_{i=r+1}^{s} \delta_{i}}{{\sum}_{i=1}^{s} \delta_{i}} = 1- \frac{{\sum}_{i=1}^{r} \delta_{i}}{{\sum}_{i=1}^{s} \delta_{i}}, $$

and combine it with the SVD knowledge as shown in (21):

$$ A = U \varSigma V^{T} = \delta_{1} u_{1} {v_{1}^{T}} + \delta_{2} u_{2} {v_{2}^{T}} + {\dots} + \delta_{s} u_{s} {v_{s}^{T}}. $$

(21)

We acquire the derivative in terms of A by the chain rule as shown in (13):

$$ \frac{\partial rank(A)}{\partial A} \approx \frac{\partial \rho(A,r)}{\partial A} = {\sum}_{i=1}^{s} \frac{\partial \rho}{\partial \delta_{i}} \cdot \frac{\partial \delta_{i}}{\partial A}. $$

A.1 $\frac {\partial \rho }{\partial \delta _{i}}$

For i ∈ [1,r], we have the following derivation from (7):

$$ \frac{\partial \rho}{\partial \delta_{i}} = \frac{0*{\sum}_{j=1}^{s} \delta_{j} - 1*{\sum}_{j=r+1}^{s} \delta_{j}}{{({\sum}_{j=1}^{s} \delta_{j})}^{2}} = - \frac{{\sum}_{j=r+1}^{s} \delta_{j}}{{({\sum}_{j=1}^{s} \delta_{j})}^{2}},\! $$

(22)

and for i ∈ [r + 1,s], we have the following derivation from (7):

$$ \frac{\partial \rho}{\partial \delta_{i}} = \frac{1*{\sum}_{j=1}^{s} \delta_{j} - 1*{\sum}_{j=r+1}^{s} \delta_{j}}{{({\sum}_{j=1}^{s} \delta_{j})}^{2}} = \frac{{\sum}_{j=1}^{r} \delta_{j}}{{({\sum}_{j=1}^{s} \delta_{j})}^{2}}. $$

(23)

Therefore, we can derive (14):

$$ \frac{\partial \rho}{\partial \delta_{i}}=\left\{ \begin{array}{rcl} - \frac{{\sum}_{j=r+1}^{s} \delta_{j}}{{\left( {\sum}_{j=1}^{s} \delta_{j} \right)}^{2}}, & & {i=1, \dots, r}\\ \\ \frac{{\sum}_{j=1}^{r} \delta_{j}}{{\left( {\sum}_{j=1}^{s} \delta_{j} \right)}^{2}}, & & {i=r+1, \dots, s} \end{array}. \right. $$

A.2 $\frac {\partial \delta _{i}}{\partial A}$

From (21), we have the following equation:

$$ \begin{array}{lll} &&\begin{pmatrix} a_{11}&a_{12}&\cdots&a_{1d}\\a_{21}&a_{22}&\cdots&a_{2d}\\\vdots&\vdots&\ddots&\vdots\\a_{n1}&a_{n2}&\cdots&a_{nd}\end{pmatrix} \\&=&\delta_{1} \begin{pmatrix} u_{11}v_{11} & u_{11}v_{12} & {\dots} & u_{11}v_{1d} \\ u_{12}v_{11} & u_{12}v_{12} & {\dots} & u_{12}v_{1d} \\ {\vdots} & {\vdots} & {\ddots} & {\vdots} \\ u_{1n}v_{11} & u_{1n}v_{12} & {\dots} & u_{1n}v_{1d} \end{pmatrix} \\ &&+ \delta_{2} \begin{pmatrix} u_{21}v_{21} & u_{21}v_{22} & {\dots} & u_{21}v_{2d} \\ u_{22}v_{21} & u_{22}v_{22} & {\dots} & u_{22}v_{2d} \\ {\vdots} & {\vdots} & {\ddots} & {\vdots} \\ u_{2n}v_{21} & u_{2n}v_{22} & {\dots} & u_{2n}v_{2d} \end{pmatrix} \\ &&+{\dots} + \delta_{s} \begin{pmatrix} u_{s1}v_{s1} & u_{s1}v_{s2} & {\dots} & u_{s1}v_{sd} \\ u_{s2}v_{s1} & u_{s2}v_{s2} & {\dots} & u_{s2}v_{sd} \\ {\vdots} & {\vdots} & {\ddots} & {\vdots} \\ u_{sn}v_{s1} & u_{sn}v_{s2} & {\dots} & u_{sn}v_{1d} \end{pmatrix} \end{array}. $$

(24)

Equivalently, we have

$$ \left\{ \begin{array}{rcl} a_{11} &=& {\sum}_{i=1}^{s} \delta_{i} u_{i1} v_{i1}\\ \\ a_{12} &=& {\sum}_{i=1}^{s} \delta_{i} u_{i1} v_{i2}\\ &\vdots& \\ a_{nd} &=& {\sum}_{i=1}^{s} \delta_{i} u_{in} v_{id} \end{array} .\right. $$

(25)

Then, we can obtain the derivatives as follows:

$$ \left\{ \begin{array}{rcl} \frac{\partial a_{11}}{\partial \delta_{i}} &=& u_{i1} v_{i1} \rightarrow \frac{\partial \delta_{i}}{\partial a_{11}} = \frac{1}{u_{i1} v_{i1}}\\ \\ \frac{\partial a_{12}}{\partial \delta_{i}} &=& u_{i1} v_{i2} \rightarrow \frac{\partial \delta_{i}}{\partial a_{12}} = \frac{1}{u_{i1} v_{i2}}\\ &\vdots& \\ \frac{\partial a_{nd}}{\partial \delta_{i}} &=& u_{in} v_{id} \rightarrow \frac{\partial \delta_{i}}{\partial a_{nd}} = \frac{1}{u_{in} v_{id}} \end{array} .\right. $$

(26)

Therefore, the derivative in terms of A can be obtained as follows:

$$ \frac{\partial \delta_{i}}{\partial A} = 1./ \begin{pmatrix} u_{i1}v_{i1} & u_{i1}v_{i2} & {\dots} & u_{i1}v_{id} \\ u_{i2}v_{i1} & u_{i2}v_{i2} & {\dots} & u_{i2}v_{id} \\ {\vdots} & {\vdots} & {\ddots} & {\vdots} \\ u_{in}v_{i1} & u_{in}v_{i2} & {\dots} & u_{in}v_{id} \end{pmatrix} =1./u_{i} {v_{i}^{T}}, $$

(27)

where ‘./’ denotes the elementwise division operation.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Kang, P., Lin, Z., Yang, Z. et al. Intra-class low-rank regularization for supervised and semi-supervised cross-modal retrieval. Appl Intell 52, 33–54 (2022). https://doi.org/10.1007/s10489-021-02308-3

Download citation

Accepted: 03 March 2021
Published: 22 April 2021
Issue Date: January 2022
DOI: https://doi.org/10.1007/s10489-021-02308-3

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Intra-class low-rank regularization for supervised and semi-supervised cross-modal retrieval

Abstract

Access this article

Similar content being viewed by others

Learning with Noisy Correspondence

Learning to Prompt for Vision-Language Models

Robust zero-shot discrete hashing with noisy labels for cross-modal retrieval

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding authors

Ethics declarations

Competing interests

Additional information

Publisher’s note

Appendices

Appendix A:: Derivation of (13)

A.1 \(\frac {\partial \rho }{\partial \delta _{i}}\)

A.2 \(\frac {\partial \delta _{i}}{\partial A}\)

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Intra-class low-rank regularization for supervised and semi-supervised cross-modal retrieval

Abstract

Access this article

Similar content being viewed by others

Learning with Noisy Correspondence

Learning to Prompt for Vision-Language Models

Robust zero-shot discrete hashing with noisy labels for cross-modal retrieval

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding authors

Ethics declarations

Competing interests

Additional information

Publisher’s note

Appendices

Appendix A:: Derivation of (13)

A.1 \(\frac {\partial \rho }{\partial \delta _{i}}\)

A.2 \(\frac {\partial \delta _{i}}{\partial A}\)

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation