DOI: 10.1145/3686215.3688372
Demonstration

Bespoke: Using LLM agents to generate just-in-time interfaces by reasoning about user intent

Published: 04 November 2024

Abstract

Large language models (LLMs) have emerged as a powerful tool for creating personalized knowledge experiences for users, often serving as their own interface through text-based chatbots. In that setting, the interpretation of user intent and the generation of output occur implicitly within the model's architecture. We propose an alternative approach in a system we call Bespoke, where the LLM acts as an agent that explicitly reasons about user intent, plans, and generates graphical interfaces to fulfill that intent. This approach enables the creation of visually rich interactions that complement chat-based interaction. By employing a step-by-step reasoning process to reduce ambiguity and keep the model on track, we compose interfaces from a toolkit of widgets, providing a designed and tailored user experience. Our early experiment shows that the output interface differs depending on the interpreted intent. In the current version, multimodality lies in the automatic generation of the UI; in future versions, this paradigm can be extended to multiple modalities of input and output. This agentive approach moves the interface towards a personalized, bespoke experience with multimodal interaction that adapts to the user's intentions. See the video demonstration in [2].
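The pipeline the abstract describes — explicit intent interpretation, then planning, then composing an interface from a widget toolkit — can be sketched as follows. This is a minimal illustration, not the Bespoke implementation: the widget names, the intent labels, and the stubbed `reason()` function (a stand-in for an LLM call) are all hypothetical.

```python
# Hypothetical sketch of the agent loop: interpret intent -> plan ->
# compose widgets from a fixed toolkit. All names here are illustrative
# assumptions; the real system delegates the reasoning step to an LLM.
from dataclasses import dataclass

# Assumed widget toolkit the agent may compose from.
WIDGET_TOOLKIT = {"summary_card", "image_gallery", "comparison_table", "chat_reply"}

@dataclass
class InterfacePlan:
    intent: str
    widgets: list

def reason(prompt: str) -> str:
    """Stand-in for an LLM call that explicitly classifies user intent."""
    text = prompt.lower()
    if "compare" in text:
        return "comparison"
    if "show" in text:
        return "browse"
    return "question"

def generate_interface(user_prompt: str) -> InterfacePlan:
    # Step 1: explicit intent interpretation (rather than implicit in generation).
    intent = reason(user_prompt)
    # Step 2: plan which widgets would fulfill that intent.
    plan = {
        "comparison": ["comparison_table", "chat_reply"],
        "browse": ["image_gallery", "summary_card"],
        "question": ["chat_reply"],
    }[intent]
    # Step 3: compose the interface, keeping only widgets the toolkit provides.
    widgets = [w for w in plan if w in WIDGET_TOOLKIT]
    return InterfacePlan(intent=intent, widgets=widgets)

# Different interpreted intents yield different output interfaces.
print(generate_interface("Compare these two laptops").widgets)
# → ['comparison_table', 'chat_reply']
```

The point of the sketch is the separation of steps: the intent label is produced explicitly before any UI is generated, which is what lets the output interface differ per intent rather than being an implicit byproduct of text generation.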

Supplemental Material

MOV File
A video explanation of the Bespoke UI system, powered by Gemini.

References

[1]
OpenAI. [n. d.]. Introducing ChatGPT. https://openai.com/index/chatgpt/. Accessed 2024-07-13.
[2]
Google. [n. d.]. Personalized AI for you | Gemini. YouTube. https://www.youtube.com/watch?v=v5tRc_5-8G4
[3]
Qirui Huang, Min Lu, Joel Lanir, Dani Lischinski, Daniel Cohen-Or, and Hui Huang. 2024. GraphiMind: LLM-centric Interface for Information Graphics Design. arXiv preprint arXiv:2401.13245 (2024).
[4]
Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, et al. 2020. Retrieval-augmented generation for knowledge-intensive NLP tasks. Advances in Neural Information Processing Systems 33 (2020), 9459–9474.
[5]
Xiao Ma, Swaroop Mishra, Ariel Liu, Sophie Ying Su, Jilin Chen, Chinmay Kulkarni, Heng-Tze Cheng, Quoc Le, and Ed Chi. 2024. Beyond chatbots: ExploreLLM for structured thoughts and personalized model responses. In Extended Abstracts of the CHI Conference on Human Factors in Computing Systems. 1–12.
[6]
Yijia Shao, Yucheng Jiang, Theodore A. Kanell, Peter Xu, Omar Khattab, and Monica S. Lam. 2024. Assisting in writing Wikipedia-like articles from scratch with large language models. arXiv preprint arXiv:2402.14207 (2024).
[7]
Chenglei Si, Yanzhe Zhang, Zhengyuan Yang, Ruibo Liu, and Diyi Yang. 2024. Design2Code: How Far Are We From Automating Front-End Engineering? arXiv preprint arXiv:2403.03163 (2024).
[8]
Sangho Suh, Meng Chen, Bryan Min, Toby Jia-Jun Li, and Haijun Xia. 2023. Structured generation and exploration of design space with large language models for human-AI co-creation. arXiv preprint arXiv:2310.12953 (2023).
[9]
Gemini Team, Rohan Anil, Sebastian Borgeaud, Yonghui Wu, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, et al. 2023. Gemini: A family of highly capable multimodal models. arXiv preprint arXiv:2312.11805 (2023).
[10]
Romal Thoppilan, Daniel De Freitas, Jamie Hall, Noam Shazeer, Apoorv Kulshreshtha, Heng-Tze Cheng, Alicia Jin, Taylor Bos, Leslie Baker, Yu Du, et al. 2022. LaMDA: Language models for dialog applications. arXiv preprint arXiv:2201.08239 (2022).
[11]
Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Fei Xia, Ed Chi, Quoc V. Le, and Denny Zhou. 2022. Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35 (2022), 24824–24837.
[12]
Zhenning Zhang, Yunan Zhang, Suyu Ge, Guangwei Weng, Mridu Narang, Xia Song, and Saurabh Tiwary. 2024. GenSERP: Large Language Models for Whole Page Presentation. arXiv preprint arXiv:2402.14301 (2024).


    Published In

    ICMI Companion '24: Companion Proceedings of the 26th International Conference on Multimodal Interaction
    November 2024
    252 pages
    ISBN:9798400704635
    DOI:10.1145/3686215
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Agents
    2. Generated UI
    3. HCI
    4. LLM

    Qualifiers

    • Demonstration
    • Research
    • Refereed limited

    Conference

ICMI '24: International Conference on Multimodal Interaction
    November 4 - 8, 2024
    San Jose, Costa Rica

    Acceptance Rates

    Overall Acceptance Rate 453 of 1,080 submissions, 42%

