Serendipity Wall: A Discussion Support System Using Real-time Speech Recognition and Large Language Model

Authors:
Shota Imamura, The University of Tokyo, JP (ORCID 0009-0005-8440-0670)
Hirotaka Hiraki, The University of Tokyo, JP (ORCID 0000-0002-6543-4593)
Jun Rekimoto, The University of Tokyo, Sony CSL Kyoto, JP (ORCID 0000-0002-3629-2514)

AHs '24: Proceedings of the Augmented Humans International Conference 2024, April 2024, Pages 237–247. https://doi.org/10.1145/3652920.3652931

Published: 01 May 2024

ABSTRACT

Group discussions are important for exploring new ideas, and discussion support systems can enhance human creative ability by improving the discussion experience. One way to support discussions is to present relevant keywords or images. However, existing approaches have tended not to take the context of the conversation, or of the presented information, into account. We therefore propose a system that develops group discussions by presenting related information in response to what is being discussed. As a specific example, this study addresses academic discussions among HCI researchers. During brainstorming sessions, the system continuously transcribes the dialogue and generates embedding vectors of the discussion. These vectors are matched against those of existing research articles to identify relevant studies. The system then presents the relevant studies on screen, together with summaries generated by an LLM. In case studies, the system broadened the topics of discussion and facilitated the acquisition of new knowledge. This study shows that AI can potentially facilitate discussion by providing support through information retrieval and summarization.
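
The matching step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which similarity measure or embedding model the system uses, so cosine similarity, the toy three-dimensional vectors, the placeholder article titles, and the function names `cosine_similarity` and `top_k_articles` are all assumptions made for illustration, not the authors' implementation.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_articles(query_vec, article_vecs, k=2):
    """Rank articles by similarity to the discussion vector and keep the best k."""
    scored = [
        (title, cosine_similarity(query_vec, vec))
        for title, vec in article_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings; a real system would obtain high-dimensional
# vectors from an embedding model for each article and for the transcript.
ARTICLES = {
    "Paper A (hypothetical)": [0.9, 0.1, 0.0],
    "Paper B (hypothetical)": [0.0, 1.0, 0.2],
    "Paper C (hypothetical)": [0.7, 0.6, 0.1],
}

# Hypothetical embedding of the most recent stretch of transcribed discussion.
discussion_vec = [1.0, 0.2, 0.0]

results = top_k_articles(discussion_vec, ARTICLES, k=2)
for title, score in results:
    print(f"{title}: {score:.3f}")
```

In a setup like this, the top-ranked articles would then be passed to an LLM for summarization before being displayed; that step is omitted here because the abstract gives no details about the prompt or model.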

Supplemental Material

v1.2.mp4: movie explanation (mp4, 272.4 MB)

Index Terms

1. Human-centered computing

Published in
AHs '24: Proceedings of the Augmented Humans International Conference 2024
April 2024
355 pages
ISBN: 9798400709807
DOI: 10.1145/3652920
Editors:
Anusha Withana, The University of Sydney, AU
Mark Billinghurst, University of South Australia, AU
Karola Marky, Ruhr-University Bochum, DE
Zhanna Sarsenbayeva, The University of Sydney, AU
Don Samitha Elvitigala, Monash University, AU
Benjamin Tag, Monash University, AU
Steeven Villa, LMU Munich, DE
Yun Suen Pai, University of Auckland, NZ
Copyright © 2024 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Discussion
Embedding Vectors
Facilitation
Information retrieval
Large Language Models
Qualifiers
- research-article
- Research
- Refereed limited