DOI: 10.1145/3586182.3615978 (UIST Conference Proceedings)
demonstration

Experiencing Visual Captions: Augmented Communication with Real-time Visuals using Large Language Models

Published: 29 October 2023

ABSTRACT

We demonstrate Visual Captions, a real-time system that integrates with a video conferencing platform to enrich verbal communication. Visual Captions leverages a fine-tuned large language model to proactively suggest visuals that are relevant to the context of the ongoing conversation. We implemented Visual Captions as a user-customizable Chrome plugin with three levels of AI proactivity: Auto-display (AI autonomously adds visuals), Auto-suggest (AI proactively recommends visuals), and On-demand-suggest (AI suggests visuals when prompted). We showcase Visual Captions in open-vocabulary settings and show how adding visuals based on the context of a conversation could improve comprehension of complex or unfamiliar concepts. In addition, we demonstrate three ways people can interact with the system, each at a different level of AI proactivity. Visual Captions is open-sourced at https://github.com/google/archat.
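The three proactivity levels named in the abstract can be read as a small decision rule over what happens to a visual once the model proposes it. The TypeScript sketch below is only an illustration of that reading: the type names, the `decideAction` function, and the `userPrompted` flag are hypothetical and are not taken from the Visual Captions codebase.

```typescript
// Hypothetical sketch: every name here is an assumption made for illustration,
// not the API of the open-source Visual Captions project.
type ProactivityMode = "auto-display" | "auto-suggest" | "on-demand-suggest";

type Action = "display" | "suggest" | "none";

// Decide what to do with a visual the model proposes for the current utterance.
// `userPrompted` is true when the user has explicitly asked for suggestions.
function decideAction(mode: ProactivityMode, userPrompted: boolean): Action {
  switch (mode) {
    case "auto-display":
      // AI autonomously adds the visual.
      return "display";
    case "auto-suggest":
      // AI proactively recommends; the user chooses whether to show it.
      return "suggest";
    case "on-demand-suggest":
      // AI stays quiet until the user prompts it.
      return userPrompted ? "suggest" : "none";
  }
}
```

Under this reading, the modes differ only in who triggers a visual and who confirms it, which is why a single mode setting is enough to make the plugin user-customizable in the sense the abstract describes.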

References

  1. Xingyu "Bruce" Liu, Vladimir Kirilyuk, Xiuxiu Yuan, Alex Olwal, Peggy Chi, Xiang "Anthony" Chen, and Ruofei Du. 2023. Visual Captions: Augmenting Verbal Communication with On-the-Fly Visuals. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (Hamburg, Germany) (CHI ’23). Association for Computing Machinery, New York, NY, USA, Article 108, 20 pages. https://doi.org/10.1145/3544548.3581566
