TRTST: Arbitrary High-Quality Text-Guided Style Transfer With Transformers


Abstract:

Text-guided style transfer aims to repaint a content image with the target style described by a text prompt, offering greater flexibility and creativity than traditional image-guided style transfer. Despite this potential, existing text-guided style transfer methods often suffer from insufficient visual quality, poor generalization ability, or a reliance on large amounts of paired training data. To address these limitations, we leverage the inherent strengths of transformers in handling multimodal data and propose a novel transformer-based framework called TRTST that not only achieves unpaired arbitrary text-guided style transfer but also significantly improves the visual quality. Specifically, TRTST combines a text transformer encoder with an image transformer encoder to project the input text prompt and content image into a joint embedding space and extract the desired style and content features. These features are then fed into a multimodal co-attention module that stylizes the image sequence based on the text sequence. We also propose a new adaptive parametric positional encoding (APPE) scheme, which uses a position encoder to adaptively produce positional encodings that match different inputs. In addition, to further improve content preservation, we introduce a text-guided identity loss. Extensive experiments and comparisons demonstrate the effectiveness and superiority of our method.
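
The abstract does not specify the internals of the multimodal co-attention module, so the following is only a minimal sketch of one common way an image token sequence can be conditioned on a text token sequence: single-head scaled dot-product cross-attention with a residual connection. Everything here is an assumption made for illustration, not the TRTST implementation: the function names (`cross_attention`, `matmul`, `softmax`), the choice of image tokens as queries and text tokens as keys and values, the single head, and the plain-list representation of token embeddings.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def cross_attention(image_tokens, text_tokens, w_q, w_k, w_v):
    """Hypothetical single-head cross-attention (illustrative assumption only).

    image_tokens: n_img x d content features (used as queries)
    text_tokens:  n_txt x d style features (used as keys and values)
    w_q, w_k, w_v: d x d projection matrices
    Returns the updated image tokens and the attention weights.
    """
    q = matmul(image_tokens, w_q)
    k = matmul(text_tokens, w_k)
    v = matmul(text_tokens, w_v)
    scale = math.sqrt(len(q[0]))
    # Each image token scores every text token.
    scores = [[sum(a * b for a, b in zip(qi, kj)) / scale for kj in k] for qi in q]
    weights = [softmax(row) for row in scores]
    attended = matmul(weights, v)
    # The residual connection keeps the original content features in the output.
    out = [[x + y for x, y in zip(img, att)] for img, att in zip(image_tokens, attended)]
    return out, weights
```

In this sketch each output row is the original image token plus a text-weighted mixture of value vectors, which is one plausible reading of "stylize the image sequence based on the text sequence". A real model would use learned weights, multiple heads, and normalization layers.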
Published in: IEEE Transactions on Image Processing (Volume: 34)
Page(s): 759 - 771
Date of Publication: 23 January 2025

PubMed ID: 40031276
