research-article

Automatically generated captions: will they help non-native speakers communicate in English?

Authors:
Nobuhiro Shimogori, Toshiba Corporation, Kawasaki, Japan
Tomoo Ikeda, Toshiba Corporation, Kawasaki, Japan
Sougo Tsuboi, Toshiba Corporation, Kawasaki, Japan

ICIC '10: Proceedings of the 3rd international conference on Intercultural collaboration
August 2010, Pages 79–86
https://doi.org/10.1145/1841853.1841865

Published: 19 August 2010

ABSTRACT

Many people find it difficult to communicate in a foreign language. One approach being studied to help them is the use of captions generated by automatic speech recognition (ASR). Captions are known to facilitate comprehension of foreign languages, but ASR-generated captions may suffer from problems caused by recognition errors and by the time that recognition takes.

We conducted two experiments with native Japanese speakers to determine how these ASR-induced differences affect comprehension when listening to English. We found that captions with 80% accuracy improve comprehension for subjects with intermediate English skills, a level that would apply to about half of native Japanese users. In addition, changing the display timing of the caption from after the speech to before the speech would improve comprehension more than raising accuracy from 80% to 100%.

These findings suggest that captions generated with today's ASR can help non-native speakers communicate in English when used carefully.

Index Terms

Information systems → World Wide Web → Web applications → Internet communications tools → Web conferencing

Published in
ICIC '10: Proceedings of the 3rd international conference on Intercultural collaboration
August 2010
300 pages
ISBN: 9781450301084
DOI: 10.1145/1841853
General Chairs: Pamela Hinds, Stanford University, USA; Anne-Marie Søderberg, Copenhagen Business School, Denmark; Ravi Vatrapu, Copenhagen Business School, Denmark
Program Chairs: Toru Ishida, Kyoto University, Japan; Martha Maznevski, International Institute for Management Development (IMD), Switzerland; Gary Olson, University of California-Irvine, USA
Copyright © 2010 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
automatic speech recognition
communication support
meeting assistance
Acceptance Rates
ICIC '10 paper acceptance rate: 47 of 77 submissions, 61%
Overall acceptance rate: 47 of 77 submissions, 61%