Research Article
DOI: 10.1145/3654777.3676370

StyleFactory: Towards Better Style Alignment in Image Creation through Style-Strength-Based Control and Evaluation

Published: 11 October 2024

Abstract

Generative AI models are widely used for image creation. However, generating images that align well with a user's personal style on aesthetic features (e.g., color and texture) can be challenging because humans and models express and interpret style poorly to one another. Through a formative study, we observed that participants had a clear subjective perception of their desired style and of variations in its strength, which directly inspired us to develop style-strength-based control and evaluation. Building on this, we present StyleFactory, an interactive system that helps users achieve style alignment. Our interface enables users to rank images by their strength in the desired style and visualizes, from the model's perspective, the strength distribution of other images in that style. In this way, users can evaluate the understanding gap between themselves and the model, and define well-aligned personal styles for image creation through targeted iterations. Our technical evaluation and user study demonstrate that StyleFactory accurately generates images in specific styles, effectively facilitates style alignment in the image creation workflow, stimulates creativity, and enhances the user experience in human-AI interaction.
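
The core interaction described above, ranking candidate images by how strongly they express a desired style and comparing that against the model's own view, can be illustrated with a small scoring routine. The sketch below is not StyleFactory's implementation; it assumes, purely for illustration, that style strength can be approximated by the cosine similarity between an image's CLIP embedding and a text description of the style. The file names and the style prompt are hypothetical.

# A minimal, illustrative sketch (NOT the paper's actual method):
# approximate "style strength" as the cosine similarity between each
# image's CLIP embedding and a text description of the desired style.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

MODEL_ID = "openai/clip-vit-base-patch32"
model = CLIPModel.from_pretrained(MODEL_ID)
processor = CLIPProcessor.from_pretrained(MODEL_ID)

def style_strength(image_paths, style_prompt):
    """Return one similarity score per image for the given style prompt."""
    images = [Image.open(p).convert("RGB") for p in image_paths]
    inputs = processor(text=[style_prompt], images=images,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        out = model(**inputs)
    # L2-normalize both embeddings so the dot product is cosine similarity.
    img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
    txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
    return (img @ txt.T).squeeze(-1).tolist()

# Hypothetical candidates: order them from weakest to strongest expression
# of the style, mirroring the ranking interaction the abstract describes.
paths = ["candidate_a.png", "candidate_b.png", "candidate_c.png"]
scores = style_strength(paths, "a watercolor painting with soft pastel colors")
ranking = sorted(zip(paths, scores), key=lambda pair: pair[1])

Under this assumption, the model-side ordering in ranking could be compared against a user's own subjective ranking of the same images to surface the human-model understanding gap.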

Supplemental Material

MP4 File: Video figure


Published In

UIST '24: Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology, October 2024, 2334 pages.
ISBN: 9798400706288
DOI: 10.1145/3654777

Publisher

Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. Human-Model Style Alignment
    2. Image Creation
    3. Style Strength

Acceptance Rates

Overall Acceptance Rate: 561 of 2,567 submissions, 22%
