research-article

Towards Artist-Like Painting Agents with Multi-Granularity Semantic Alignment

Authors:

Bingbing Ni

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia

Pages 10191 - 10199

https://doi.org/10.1145/3664647.3681245

Published: 28 October 2024

Abstract

Mainstream painting agents based on stroke-based rendering (SBR) attempt to translate visual appearance into a sequence of vectorized painting-style strokes. Lacking a direct (and therefore differentiable) mapping between the pixel domain and the stroke-parameter search space, these methods often yield unrealistic, artist-incompatible stroke decompositions, hindering their further application in high-quality art generation. To address this issue explicitly, we propose a novel SBR-based image-to-painting framework that aligns with the behaviors and techniques of artistic oil painting. At its heart is a semantic content stratification module that decomposes images into hierarchical painting regions encapsulated with semantics. Guided by these regions, a coarse-to-fine strategy first fills in the abstract structure of the painting with coarse brushstrokes and then renders detailed texture through a localized multi-scale stroke search run in parallel. We also propose a novel method that integrates SBR frameworks into a simulation-based interactive painting system for stroke quality assessment. Extensive experimental results on a wide range of images show that our method not only achieves high-fidelity, artist-like painting rendering with a reduced number of strokes, but also exhibits higher stroke quality than prior methods.
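The coarse-to-fine idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a precomputed integer label map as a stand-in for the semantic stratification module, uses flat-colored square patches as stand-ins for brushstrokes, and accepts a stroke only when it lowers the squared pixel error. All function names (`coarse_pass`, `fine_pass`, `l2_error`, `make_toy_inputs`) are hypothetical.

```python
import numpy as np


def l2_error(a, b):
    """Sum of squared pixel differences between two images."""
    return float(((a - b) ** 2).sum())


def coarse_pass(target, labels):
    """Coarse stage: fill each labeled region with its mean color.

    Assumption: one flat fill per region stands in for coarse brushstrokes.
    """
    canvas = np.zeros_like(target)
    for region_id in np.unique(labels):
        mask = labels == region_id
        canvas[mask] = target[mask].mean(axis=0)
    return canvas


def fine_pass(target, canvas, labels, scales=(8, 4), tol=1e-3):
    """Fine stage: refine each region with square strokes of decreasing size.

    Regions are processed independently, so this outer loop could be
    dispatched in parallel; it runs sequentially here for simplicity.
    """
    canvas = canvas.copy()
    height, width = labels.shape
    n_strokes = 0
    for region_id in np.unique(labels):
        mask = labels == region_id
        for size in scales:
            for y in range(0, height, size):
                for x in range(0, width, size):
                    window = (slice(y, y + size), slice(x, x + size))
                    local = mask[window]  # pixels of this region in the window
                    if not local.any():
                        continue
                    patch = target[window][local]
                    color = patch.mean(axis=0)  # best flat color in the L2 sense
                    view = canvas[window]  # basic slicing returns a view
                    old_err = ((view[local] - patch) ** 2).sum()
                    new_err = ((color - patch) ** 2).sum()
                    if old_err - new_err > tol:  # keep strokes that help
                        view[local] = color
                        n_strokes += 1
    return canvas, n_strokes


def make_toy_inputs(size=16):
    """Hypothetical test data: a horizontal red ramp split into two regions."""
    target = np.zeros((size, size, 3))
    target[..., 0] = np.linspace(0.0, 1.0, size)[None, :]
    labels = np.zeros((size, size), dtype=int)
    labels[:, size // 2:] = 1
    return target, labels
```

Calling `coarse_pass` and then `fine_pass` on the output of `make_toy_inputs()` gives a canvas whose error against the target is lower after the fine stage. In a real system the label map would come from a segmentation model and the strokes would be parameterized brush shapes rather than square patches.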


Index Terms

Towards Artist-Like Painting Agents with Multi-Granularity Semantic Alignment
1. Applied computing
  1. Arts and humanities
    1. Fine arts
    2. Media arts


Published In

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia

October 2024

11719 pages

ISBN:9798400706868

DOI:10.1145/3664647

General Chairs:
Jianfei Cai (Monash University, Australia)
Mohan Kankanhalli (NUS, Singapore)
Balakrishnan Prabhakaran (UT Dallas, USA)
Susanne Boll (University of Oldenburg, Germany)

Program Chairs:
Ramanathan Subramanian (University of Canberra & IIT Ropar, Australia)
Liang Zheng (Australian National University, Australia)
Vivek K. Singh (Rutgers University, USA)
Pablo Cesar (Centrum Wiskunde & Informatica, Netherlands)
Lexing Xie (Australian National University, Australia)
Dong Xu (University of Hong Kong, Hong Kong)

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States



Funding Sources

National Science Foundation of China

Conference

MM '24: The 32nd ACM International Conference on Multimedia

Sponsor: SIGMM

October 28 - November 1, 2024

Melbourne VIC, Australia

Acceptance Rates

MM '24 Paper Acceptance Rate 1,150 of 4,385 submissions, 26%;

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
