DOI: 10.1145/3674829.3675082
short-paper

Comuniqa: Exploring Large Language Models For Improving English Speaking Skills

Published: 28 August 2024

Abstract

In this paper, we investigate the potential of Large Language Models (LLMs) to improve English speaking skills. This is particularly relevant in countries like India, where English is crucial for academic, professional, and personal communication but remains a non-native language for many. Traditional methods for enhancing speaking skills often rely on human experts, which can be limited in terms of scalability, accessibility, and affordability. Recent advancements in Artificial Intelligence (AI) offer promising solutions to overcome these limitations.
We propose Comuniqa, a novel LLM-based system designed to enhance English speaking skills. We adopt a human-centric evaluation approach, comparing Comuniqa with the feedback and instructions provided by human experts. In our evaluation, we divide the participants into three groups: those who use the LLM-based system to improve their speaking skills, those guided by human experts for the same task, and those who use both the LLM-based system and the human experts. Using surveys, interviews, and actual study sessions, we provide a detailed perspective on the effectiveness of different learning modalities. Our preliminary findings suggest that while LLM-based systems have commendable accuracy, they lack human-level cognitive capabilities, both in terms of accuracy and empathy. Nevertheless, Comuniqa represents a significant step towards achieving Sustainable Development Goal 4: Quality Education by providing a valuable learning tool for individuals who may not have access to human experts for improving their speaking skills.

Published In

COMPASS '24: Proceedings of the 7th ACM SIGCAS/SIGCHI Conference on Computing and Sustainable Societies
July 2024
354 pages
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Human-LLM Collaboration
  2. Large Language Models
  3. Speaking Skills
  4. Speech Interface

Qualifiers

  • Short-paper
  • Research
  • Refereed limited

Conference

COMPASS '24

Acceptance Rates

Overall Acceptance Rate 25 of 50 submissions, 50%
