DOI: 10.1145/3635636.3664261
C&C '24 Conference Proceedings · Poster

Communicating Design Intent Using Drawing and Text

Published: 23 June 2024

Abstract

Realizing a designer’s intent in software currently requires tedious manipulation of geometric primitives, such as points and curves. By contrast, designers routinely communicate more abstract design goals to one another using an efficient combination of natural language and drawings. What would it take to develop artificial systems that understand how humans naturally convey design intent, and thereby enable more seamless interactions between humans and machines throughout the design process? First, it is vital to establish benchmarks that showcase the full range of strategies that humans use to successfully communicate about design intent. Here we take initial steps towards that goal by conducting an online study in which pairs of human participants – a “Designer” and “Maker” – collaborated over multiple turns to recreate target designs. In each turn, Designers sent messages containing language, drawings, or both to the Maker, describing how to modify an existing design toward the target. We found a preference for communicating using drawings in early turns and observed several multimodal strategies for conveying design intent. By comparing how human Makers and GPT-4V carried out instructions, we identify a gap in human and machine understanding of multimodal instructions and suggest a path for bridging this gap.
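The study's machine comparison sends each Designer message, which may combine text and a drawing, to GPT-4V. As an illustrative sketch only (the paper does not publish its harness, and the helper name `build_multimodal_instruction` is our own), a multimodal instruction can be packaged in the content format that OpenAI's vision-capable chat models document, without needing the SDK or an API call to construct it:

```python
import base64


def build_multimodal_instruction(text, drawing_png_bytes=None):
    """Package a Designer's turn as a single chat message.

    The message combines the natural-language instruction with an
    optional drawing, using the list-of-parts content format that
    OpenAI's vision-capable chat models accept. Images are inlined
    as a base64 data URL so no file hosting is needed.
    """
    content = [{"type": "text", "text": text}]
    if drawing_png_bytes is not None:
        b64 = base64.b64encode(drawing_png_bytes).decode("ascii")
        content.append({
            "type": "image_url",
            "image_url": {"url": f"data:image/png;base64,{b64}"},
        })
    return {"role": "user", "content": content}


# Example turn: language plus a drawing (placeholder PNG bytes here).
msg = build_multimodal_instruction(
    "Move the leftmost block two units to the right.",
    drawing_png_bytes=b"\x89PNG",
)
```

The resulting dict would be passed as one element of the `messages` list in a chat-completion request; a text-only turn simply omits the image part.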



Published In

C&C '24: Proceedings of the 16th Conference on Creativity & Cognition
June 2024
718 pages
ISBN:9798400704857
DOI:10.1145/3635636
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

  • Poster
  • Research
  • Refereed limited

Conference

C&C '24: Creativity and Cognition
June 23–26, 2024
Chicago, IL, USA

Acceptance Rates

Overall acceptance rate: 108 of 371 submissions (29%)

Article Metrics

  • Total citations: 0
  • Total downloads: 232
  • Downloads (last 12 months): 232
  • Downloads (last 6 weeks): 26

Reflects downloads up to 07 Mar 2025
