Abstract:
Fine-grained image-text retrieval aims to find relevant images among fine-grained classes given a text query, or vice versa. The challenges lie not only in bridging the gap between two heterogeneous modalities but also in handling the large inter-class similarity and intra-class variance present in fine-grained data. To address these challenges, we propose a Discriminative Latent Space Learning (DLSL) method for fine-grained image-text retrieval. Concretely, image and text features are extracted to capture the subtle differences in fine-grained data. Based on the extracted features, we then perform coupled dictionary learning to align the heterogeneous data in a unified latent space. To make this alignment discriminative enough for the fine-grained task, the learned latent space is endowed with a discriminative property by learning a discriminative map. Comprehensive experiments on fine-grained datasets demonstrate the effectiveness of our approach.
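The general idea of aligning two modalities through coupled dictionaries and a discriminative map can be illustrated with a minimal sketch. This is not the DLSL formulation, which the abstract does not specify. It assumes a ridge-regularized objective (rather than a sparse one) in which both modalities share one latent code matrix, and a linear map from latent codes to one-hot class labels stands in for the discriminative map. All names (`fit_coupled_dictionaries`, `encode`, `retrieve_images`) and hyperparameters (`k`, `lam`, `mu`) are hypothetical.

```python
import numpy as np


def _ridge_right(T, A, lam):
    """Solve min_D ||T - D A||^2 + lam ||D||^2 for D (closed form)."""
    M = A @ A.T + lam * np.eye(A.shape[0])
    return np.linalg.solve(M, A @ T.T).T


def fit_coupled_dictionaries(X, Y, L, k=8, lam=0.1, mu=1.0, iters=30, seed=0):
    """Alternating least squares on an assumed objective:
    ||X - Dx A||^2 + ||Y - Dy A||^2 + mu ||L - W A||^2 + ridge terms.

    X: (dx, n) image features, Y: (dy, n) text features,
    L: (c, n) one-hot labels. Columns are paired samples.
    """
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((k, X.shape[1]))  # shared latent codes
    for _ in range(iters):
        # Update both dictionaries and the label map with codes fixed.
        Dx = _ridge_right(X, A, lam)
        Dy = _ridge_right(Y, A, lam)
        W = _ridge_right(L, A, lam)
        # Update the shared codes with dictionaries and map fixed.
        H = Dx.T @ Dx + Dy.T @ Dy + mu * (W.T @ W) + lam * np.eye(k)
        A = np.linalg.solve(H, Dx.T @ X + Dy.T @ Y + mu * (W.T @ L))
    return Dx, Dy, W


def encode(D, F, lam=0.1):
    """Project features F (d, m) of one modality into the latent space."""
    k = D.shape[1]
    return np.linalg.solve(D.T @ D + lam * np.eye(k), D.T @ F)


def retrieve_images(Dx, Dy, X_gallery, y_query, lam=0.1):
    """Rank gallery images for one text query by cosine similarity
    between latent codes. Returns gallery indices, best match first."""
    Ax = encode(Dx, X_gallery, lam)
    ay = encode(Dy, y_query.reshape(-1, 1), lam)
    Ax = Ax / (np.linalg.norm(Ax, axis=0, keepdims=True) + 1e-12)
    ay = ay / (np.linalg.norm(ay) + 1e-12)
    return np.argsort(-(Ax.T @ ay).ravel())
```

In this sketch the label term pulls the shared codes of same-class samples toward each other, which is one simple way to make a latent space class-aware. Text-to-image retrieval encodes the query with the text dictionary and the gallery with the image dictionary; swapping the two dictionaries gives the reverse direction.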
Published in: IEEE Signal Processing Letters (Volume: 28)