research-article

Designing Pronunciation Learning Tools: The Case for Interactivity against Over-Engineering

Authors:

Sean Robertson,

Cosmin Munteanu,

Gerald Penn

CHI '18: Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems

Paper No.: 356, Pages 1 - 13

https://doi.org/10.1145/3173574.3173930

Published: 21 April 2018

Abstract

Paired role-play is a common collaborative activity in language learning classrooms, adding meaning and cultural context to the learning process. This is complemented by teachers' immediate and explicit feedback. Interactive tools that provide explicit feedback during collaborative learning are scarce, however. More commonly, supporting dialogue practice takes the form of computer-aided single-student read-and-record activities. This limitation is partly due to the complexity of processing language learners' speech in unconstrained tasks. In this paper, we assess the value of pronunciation error detection algorithms within a realistic, software-aided, paired role-playing task with beginning learners of French. We found that students' pronunciations improve regardless of the type of error detector employed -- even for those using simple heuristics. We suggest that speech technologies for language learning have been too focused on engineering goals. Instead, new interactive designs supporting collaboration may be used to overcome engineering limitations and properly support students' engagement.

Index Terms

Designing Pronunciation Learning Tools: The Case for Interactivity against Over-Engineering
1. Human-centered computing
  1. Human computer interaction (HCI)

Published In

CHI '18: Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems

April 2018

8489 pages

ISBN:9781450356206

DOI:10.1145/3173574

General Chairs: Regan Mandryk (University of Saskatchewan, Canada), Mark Hancock (University of Waterloo, Canada)

Program Chairs: Mark Perry (Brunel University London, UK), Anna Cox (University College London, UK)

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGCHI: ACM Special Interest Group on Computer-Human Interaction

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Ontario Centres of Excellence

Conference

CHI '18: CHI Conference on Human Factors in Computing Systems

April 21 - 26, 2018

Montreal QC, Canada

Acceptance Rates

CHI '18 Paper Acceptance Rate 666 of 2,590 submissions, 26%;

Overall Acceptance Rate 6,199 of 26,314 submissions, 24%
