DOI: 10.1145/3664647.3681245
research-article

Towards Artist-Like Painting Agents with Multi-Granularity Semantic Alignment

Published: 28 October 2024

Abstract

Mainstream painting agents based on stroke-based rendering (SBR) attempt to translate visual appearance into a sequence of vectorized, painting-style strokes. Lacking a direct mapping (and consequently differentiability) between the pixel domain and the stroke-parameter search space, these methods often yield unrealistic, artist-incompatible stroke decompositions, which hinders their application to high-quality art generation. To address this issue explicitly, we propose a novel SBR-based image-to-painting framework that aligns with the behaviors and techniques of oil painting artists. At its heart is a semantic content stratification module that decomposes an image into hierarchical painting regions, each encapsulating semantics. Based on these regions, we develop a coarse-to-fine strategy that first fills in the abstract structure of the painting with coarse brushstrokes and then depicts detailed texture with a localized multi-scale stroke search run in parallel across regions. We also propose a novel method that integrates SBR frameworks into a simulation-based interactive painting system for stroke quality assessment. Extensive experimental results on a wide range of images show that our method not only achieves high-fidelity, artist-like painting results with fewer strokes, but also exhibits higher stroke quality than prior methods.
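The coarse-to-fine idea described in the abstract can be illustrated with a toy sketch. This is not the authors' method: the abstract gives no implementation details, so everything below is an assumption made for illustration. The function name `coarse_to_fine_paint`, the precomputed `labels` map (a stand-in for the semantic stratification module), the flat-fill "strokes", and the `block` and `tol` parameters are all hypothetical. The sketch only shows the control flow: one coarse fill per region, followed by an independent local refinement inside each region.

```python
from statistics import mean


def coarse_to_fine_paint(image, labels, block=2, tol=0.05):
    """Toy coarse-to-fine painter on a grayscale image (rows of floats).

    `labels` assigns every pixel to a region; it stands in for a semantic
    segmentation and is assumed to be given. A "stroke" here is just a flat
    fill, which is far simpler than a real parameterized brushstroke.
    """
    h, w = len(image), len(image[0])
    canvas = [[0.0] * w for _ in range(h)]
    strokes = []  # entries: (kind, region_label, value, pixel_count)

    # Group pixel coordinates by region label.
    regions = {}
    for y in range(h):
        for x in range(w):
            regions.setdefault(labels[y][x], []).append((y, x))

    # Coarse pass: one flat stroke per region, using the region's mean value.
    for label, pixels in regions.items():
        value = mean(image[y][x] for y, x in pixels)
        for y, x in pixels:
            canvas[y][x] = value
        strokes.append(("coarse", label, value, len(pixels)))

    # Fine pass: each region is refined independently of the others, so this
    # loop could be run in parallel. A block gets a fine stroke only when the
    # coarse result is still too far from the target inside that block.
    for label, pixels in regions.items():
        cells = {}
        for y, x in pixels:
            cells.setdefault((y // block, x // block), []).append((y, x))
        for cell in cells.values():
            error = mean(abs(image[y][x] - canvas[y][x]) for y, x in cell)
            if error > tol:
                value = mean(image[y][x] for y, x in cell)
                for y, x in cell:
                    canvas[y][x] = value
                strokes.append(("fine", label, value, len(cell)))

    return canvas, strokes


def mean_abs_error(a, b):
    """Mean absolute per-pixel difference between two same-sized images."""
    return mean(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


# Hypothetical 4x4 example: a flat left region and a two-tone right region.
example_image = [
    [0.2, 0.2, 0.8, 0.8],
    [0.2, 0.2, 0.8, 0.8],
    [0.2, 0.2, 0.4, 0.4],
    [0.2, 0.2, 0.4, 0.4],
]
example_labels = [[0, 0, 1, 1] for _ in range(4)]
example_canvas, example_strokes = coarse_to_fine_paint(example_image, example_labels)
```

In this example the flat region needs only its coarse stroke, while the two-tone region receives additional fine strokes, which mirrors the abstract's point that detail should be added only where the coarse structure is insufficient.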

      Published In

      MM '24: Proceedings of the 32nd ACM International Conference on Multimedia
      October 2024
      11719 pages
      ISBN:9798400706868
      DOI:10.1145/3664647

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. painting agent
      2. semantic stratification
      3. stroke-based rendering

      Qualifiers

      • Research-article

      Funding Sources

      • National Science Foundation of China

      Conference

      MM '24
      MM '24: The 32nd ACM International Conference on Multimedia
      October 28 - November 1, 2024
      Melbourne VIC, Australia

      Acceptance Rates

      MM '24 Paper Acceptance Rate 1,150 of 4,385 submissions, 26%;
      Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
