Image Difference Captioning With Instance-Level Fine-Grained Feature Representation


Abstract:

The task of image difference captioning aims at locating changed objects in similar image pairs and describing the difference in natural language. The key challenges of this task are to comprehend the context of image pairs sufficiently and to locate the changed objects accurately in the presence of viewpoint change. Previous studies focus on pixel-level image features, neglecting the rich explicit features of objects in an image pair, which are beneficial for generating a fine-grained difference caption. Additionally, existing generative models struggle to locate the differences accurately under the interference of viewpoint change. To address these issues, we propose an Instance-Level Fine-Grained Difference Captioning (IFDC) model, which consists of a fine-grained feature extraction module, a multi-round feature fusion module, a similarity-based difference finding module, and a difference captioning module. To describe the changed objects comprehensively, we extract fine-grained features at the instance level, i.e., visual features, semantic features, and positional features, as the objects’ representation. To enhance the model’s robustness to viewpoint change, we design a similarity-based difference finding module to locate the changed objects accurately. Extensive experiments show that our IFDC model achieves performance comparable to that of state-of-the-art models on the CLEVR-Change and Spot-the-Diff datasets, thus verifying the effectiveness of our proposed model. Our source code is available at https://github.com/VISLANG-Lab/IFDC.
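The idea of finding differences by similarity at the instance level can be illustrated with a small sketch. This is not the IFDC implementation: the abstract does not say how similarity is computed, so the use of cosine similarity, the `threshold` value, and the names `cosine` and `find_changed` are all assumptions made for illustration. Each object is stood in for by a short feature vector (in the paper's setting this would combine visual, semantic, and positional features), and an object in the first image is flagged as changed when no object in the second image is similar enough to it.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def find_changed(before, after, threshold=0.9):
    """Indices of instances in `before` with no close match in `after`.

    Hypothetical helper: the threshold and the matching rule are
    illustrative assumptions, not taken from the paper.
    """
    changed = []
    for i, feat in enumerate(before):
        # Best similarity to any instance in the other image (0.0 if none).
        best = max((cosine(feat, other) for other in after), default=0.0)
        if best < threshold:
            changed.append(i)
    return changed


# Toy instance features: three objects before, two objects after.
# The "after" features are slightly perturbed, as a shifted viewpoint might do.
before = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
after = [[0.0, 0.98, 0.05], [0.99, 0.02, 0.0]]

changed = find_changed(before, after)
print(changed)
```

Because matching is done per object rather than per pixel location, the two perturbed objects still find their counterparts, and only the third object, which has no counterpart in the second image, is reported.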
Published in: IEEE Transactions on Multimedia (Volume: 24)
Page(s): 2004 - 2017
Date of Publication: 21 April 2021
