Language-Based Image Manipulation Built on Language-Guided Ranking


Abstract:

Text-based image manipulation is a popular subject with many applications. However, it is a challenging task because no ground-truth edited dataset exists and textual descriptions are abstract and ambiguous. To address these difficulties, we propose a manipulation framework consisting of proposal attentional GANs, a language-related semantic mask, and a language-guided ranker. Specifically, we construct an editing proposal generator that generates suitable edited proposals with and without semantic conditions and supports the reorganization of sub-generators to output proposals covering as many aspects as possible. To distinguish text-relevant from text-irrelevant regions, we introduce a language-related semantic mask based on the source image and the target caption. We then use a language-guided ranker to retrieve the best edited result from the edited proposals using multi-modal similarity and the language-related semantic mask. Extensive experiments on widely used datasets demonstrate that our model can manipulate images interactively and effectively improve editing quality.
Published in: IEEE Transactions on Multimedia (Volume: 25)
Page(s): 6219 - 6231
Date of Publication: 15 September 2022
