DOI: 10.1145/3550082.3564169
poster

Language-driven Diversified Image Retargeting

Published: 13 December 2022

Abstract

Content-aware image resizing can automatically retarget an image to different aspect ratios while preserving visually salient content. However, it is difficult for users to interact with the retargeting process and control the results. In this paper, we propose a language-driven diversified image retargeting (LDIR) method that allows users to control the retargeting process by providing additional textual descriptions. Taking the original image and user-provided texts as inputs, LDIR retargets the image to the desired resolution while preserving the content indicated by the texts. Following a self-play reinforcement learning pipeline, we propose a multimodal reward function that considers both visual quality and language guidance. Preliminary experiments show that LDIR can achieve diversified image retargeting guided by texts.
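
The abstract does not give the form of the reward, so the following is only a minimal sketch of how a reward combining visual quality and language guidance could be wired into a self-play comparison. Everything in it is an assumption made for illustration: the names `multimodal_score` and `self_play_rewards`, the weighted-sum combination, the `text_weight` parameter, the use of cosine similarity between precomputed image and text embeddings as the language term, and the win/lose reward of +1/-1. It is not the authors' implementation.

```python
import math
from typing import Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def multimodal_score(
    visual_quality: float,
    image_embedding: Sequence[float],
    text_embedding: Sequence[float],
    text_weight: float = 0.5,
) -> float:
    """Hypothetical score: weighted sum of visual quality and image-text similarity.

    Assumes `visual_quality` comes from some external quality measure and that
    the two embeddings live in a shared image-text space; neither is specified
    in the abstract.
    """
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must be in [0, 1]")
    language_score = cosine_similarity(image_embedding, text_embedding)
    return (1.0 - text_weight) * visual_quality + text_weight * language_score


def self_play_rewards(score_a: float, score_b: float) -> Tuple[float, float]:
    """Assumed self-play rule: the higher-scoring candidate gets +1, the other -1."""
    if score_a > score_b:
        return 1.0, -1.0
    if score_a < score_b:
        return -1.0, 1.0
    return 0.0, 0.0


if __name__ == "__main__":
    text = [1.0, 0.0]  # toy embedding of the user-provided text
    # Two retargeted candidates with equal visual quality; only the first
    # keeps the content that the text describes.
    candidate_a = multimodal_score(0.8, [1.0, 0.0], text)
    candidate_b = multimodal_score(0.8, [0.0, 1.0], text)
    print(self_play_rewards(candidate_a, candidate_b))  # (1.0, -1.0)
```

In this toy setup the two candidates tie on visual quality, so the language term alone decides the winner. Changing the text embedding would change which candidate is rewarded, which is one way text could steer retargeting toward different results.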


Published In

SA '22: SIGGRAPH Asia 2022 Posters
December 2022
120 pages
ISBN: 9781450394628
DOI: 10.1145/3550082
Editors: Soon Ki Jung, Neil Dodgson

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery, New York, NY, United States


Qualifiers

• Poster
• Research
• Refereed limited

Conference

SA '22: SIGGRAPH Asia 2022
December 6 - 9, 2022
Daegu, Republic of Korea

Acceptance Rates

Overall Acceptance Rate: 178 of 869 submissions, 20%
