DOI: 10.1145/3686215.3688372
Demonstration

Bespoke: Using LLM agents to generate just-in-time interfaces by reasoning about user intent

Published: 04 November 2024

Abstract

Large language models (LLMs) have emerged as a powerful tool for creating personalized knowledge experiences for users, often serving as their own interface through text-based chatbots. In that setting, the interpretation of user intent and the generation of output occur implicitly within the model's architecture. We propose an alternative approach in a system we call Bespoke, where the LLM acts as an agent that explicitly reasons about user intent, plans, and generates graphical interfaces to fulfill that intent. This approach enables the creation of visually rich interactions that complement chat-based interaction. By employing a step-by-step reasoning process to reduce ambiguity and keep the model on track, we compose interfaces from a toolkit of widgets, providing a designed and tailored user experience. Our early experiment shows that the output interface differs depending on the interpreted intent. In the current version, multimodality lies in the automatic generation of the UI; in future versions, this paradigm can be extended to multiple modalities of input and output. This agentive approach moves the interface towards a personalized, bespoke experience with multimodal interaction that adapts to the user's intentions. See the video demonstration in [2].
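The pipeline the abstract describes — explicit intent interpretation, then planning, then composing an interface from a widget toolkit — can be sketched as follows. This is a minimal illustration, not the Bespoke implementation: the widget names, the intent labels, and the stubbed `reason()` function (a stand-in for an LLM call) are all hypothetical.

```python
# Hypothetical sketch of the agent loop: interpret intent -> plan ->
# compose widgets from a fixed toolkit. All names here are illustrative
# assumptions; the real system delegates the reasoning step to an LLM.
from dataclasses import dataclass

# Assumed widget toolkit the agent may compose from.
WIDGET_TOOLKIT = {"summary_card", "image_gallery", "comparison_table", "chat_reply"}

@dataclass
class InterfacePlan:
    intent: str
    widgets: list

def reason(prompt: str) -> str:
    """Stand-in for an LLM call that explicitly classifies user intent."""
    text = prompt.lower()
    if "compare" in text:
        return "comparison"
    if "show" in text:
        return "browse"
    return "question"

def generate_interface(user_prompt: str) -> InterfacePlan:
    # Step 1: explicit intent interpretation (rather than implicit in generation).
    intent = reason(user_prompt)
    # Step 2: plan which widgets would fulfill that intent.
    plan = {
        "comparison": ["comparison_table", "chat_reply"],
        "browse": ["image_gallery", "summary_card"],
        "question": ["chat_reply"],
    }[intent]
    # Step 3: compose the interface, keeping only widgets the toolkit provides.
    widgets = [w for w in plan if w in WIDGET_TOOLKIT]
    return InterfacePlan(intent=intent, widgets=widgets)

# Different interpreted intents yield different output interfaces.
print(generate_interface("Compare these two laptops").widgets)
# → ['comparison_table', 'chat_reply']
```

The point of the sketch is the separation of steps: the intent label is produced explicitly before any UI is generated, which is what lets the output interface differ per intent rather than being an implicit byproduct of text generation.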

Supplemental Material

MOV File
A video explanation of the Bespoke UI system, powered by Gemini.

References

[1]
OpenAI. [n. d.]. Introducing ChatGPT. https://openai.com/index/chatgpt/. Accessed 2024-07-13.
[2]
Google. [n. d.]. Personalized AI for you | Gemini. YouTube. https://www.youtube.com/watch?v=v5tRc_5-8G4
[3]
Qirui Huang, Min Lu, Joel Lanir, Dani Lischinski, Daniel Cohen-Or, and Hui Huang. 2024. GraphiMind: LLM-centric Interface for Information Graphics Design. arXiv preprint arXiv:2401.13245 (2024).
[4]
Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, et al. 2020. Retrieval-augmented generation for knowledge-intensive NLP tasks. Advances in Neural Information Processing Systems 33 (2020), 9459–9474.
[5]
Xiao Ma, Swaroop Mishra, Ariel Liu, Sophie Ying Su, Jilin Chen, Chinmay Kulkarni, Heng-Tze Cheng, Quoc Le, and Ed Chi. 2024. Beyond chatbots: ExploreLLM for structured thoughts and personalized model responses. In Extended Abstracts of the CHI Conference on Human Factors in Computing Systems. 1–12.
[6]
Yijia Shao, Yucheng Jiang, Theodore A. Kanell, Peter Xu, Omar Khattab, and Monica S. Lam. 2024. Assisting in writing Wikipedia-like articles from scratch with large language models. arXiv preprint arXiv:2402.14207 (2024).
[7]
Chenglei Si, Yanzhe Zhang, Zhengyuan Yang, Ruibo Liu, and Diyi Yang. 2024. Design2Code: How Far Are We From Automating Front-End Engineering? arXiv preprint arXiv:2403.03163 (2024).
[8]
Sangho Suh, Meng Chen, Bryan Min, Toby Jia-Jun Li, and Haijun Xia. 2023. Structured generation and exploration of design space with large language models for human-AI co-creation. arXiv preprint arXiv:2310.12953 (2023).
[9]
Gemini Team, Rohan Anil, Sebastian Borgeaud, Yonghui Wu, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, et al. 2023. Gemini: A family of highly capable multimodal models. arXiv preprint arXiv:2312.11805 (2023).
[10]
Romal Thoppilan, Daniel De Freitas, Jamie Hall, Noam Shazeer, Apoorv Kulshreshtha, Heng-Tze Cheng, Alicia Jin, Taylor Bos, Leslie Baker, Yu Du, et al. 2022. LaMDA: Language models for dialog applications. arXiv preprint arXiv:2201.08239 (2022).
[11]
Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Fei Xia, Ed Chi, Quoc V. Le, and Denny Zhou. 2022. Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35 (2022), 24824–24837.
[12]
Zhenning Zhang, Yunan Zhang, Suyu Ge, Guangwei Weng, Mridu Narang, Xia Song, and Saurabh Tiwary. 2024. GenSERP: Large Language Models for Whole Page Presentation. arXiv preprint arXiv:2402.14301 (2024).


    Published In

    ICMI Companion '24: Companion Proceedings of the 26th International Conference on Multimodal Interaction
    November 2024
    252 pages
    ISBN:9798400704635
    DOI:10.1145/3686215
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Agents
    2. Generated UI
    3. HCI
    4. LLM

    Qualifiers

    • Demonstration
    • Research
    • Refereed limited

    Conference

ICMI '24: International Conference on Multimodal Interaction
    November 4 - 8, 2024
    San Jose, Costa Rica

    Acceptance Rates

    Overall Acceptance Rate 453 of 1,080 submissions, 42%

