
Latent Space Semantic Supervision Based on Knowledge Distillation for Cross-Modal Retrieval


Abstract:

As an important field in information retrieval, fine-grained cross-modal retrieval has received great attention from researchers. Existing fine-grained cross-modal retrieval methods have made progress in capturing the fine-grained interplay between vision and language, but they fail to consider the fine-grained correspondences between features in the image latent space and those in the text latent space, which may lead to inaccurate inference of intra-modal relations or false alignment of cross-modal information. Since object detection provides fine-grained correspondences between image region features and their semantic features, this paper proposes a novel latent space semantic supervision model based on knowledge distillation (L3S-KD). For fine-grained alignment in the image latent space, classifiers are trained under the supervision of the fine-grained correspondences obtained from an object detection model via knowledge distillation; for fine-grained alignment in the text latent space, they are supervised by the labels of objects and attributes. Compared with existing fine-grained correspondence matching methods, L3S-KD learns more accurate semantic similarities for local fragments in image-text pairs. Extensive experiments on the MS-COCO and Flickr30K datasets demonstrate that L3S-KD consistently outperforms state-of-the-art methods for image-text matching.
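The abstract describes two supervision signals over the latent spaces: a knowledge-distillation signal from an object detector for image region features, and hard object/attribute labels for text fragment features. The sketch below (Python/PyTorch) illustrates one plausible way such losses could be formed; it is not the authors' implementation, and all module names, dimensions, and the temperature value are illustrative assumptions.

    # Minimal sketch (assumed, not the authors' code) of latent-space semantic
    # supervision: a student classifier over image region features is distilled
    # from a detector's class distributions, while text fragment features are
    # supervised with hard object/attribute labels.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class LatentSpaceClassifiers(nn.Module):
        def __init__(self, feat_dim=1024, num_classes=1600, temperature=4.0):
            super().__init__()
            self.img_classifier = nn.Linear(feat_dim, num_classes)  # over image region features
            self.txt_classifier = nn.Linear(feat_dim, num_classes)  # over text fragment features
            self.T = temperature

        def distill_loss(self, region_feats, teacher_logits):
            # KL divergence between the detector's softened class distribution
            # (teacher) and the student's predictions on latent region features.
            student_log_p = F.log_softmax(self.img_classifier(region_feats) / self.T, dim=-1)
            teacher_p = F.softmax(teacher_logits / self.T, dim=-1)
            return F.kl_div(student_log_p, teacher_p, reduction="batchmean") * self.T ** 2

        def text_label_loss(self, word_feats, object_attribute_labels):
            # Cross-entropy on text latent features against object/attribute labels.
            return F.cross_entropy(self.txt_classifier(word_feats), object_attribute_labels)

In this reading, the distillation term aligns the image latent space with the detector's fine-grained semantics, while the label term aligns the text latent space with the same vocabulary of objects and attributes, so that local fragments from the two modalities become comparable.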
Published in: IEEE Transactions on Image Processing ( Volume: 31)
Page(s): 7154 - 7164
Date of Publication: 10 November 2022

PubMed ID: 36355734
