ABSTRACT
Video captioning can increase the accessibility of information for people who are deaf or hard of hearing, and can also benefit second-language learners and reading-deficient students. We propose a caption editing system that harnesses crowdsourced work for the useful task of video captioning. To make the task an engaging activity, its interface incorporates game-like elements: non-expert users submit transcriptions of short video segments against a countdown timer, in either a "type" or a "fix" mode, to score points. Transcriptions from multiple users are then aligned and merged to form the final captions. Preliminary results with 42 participants and 578 short video segments show that, with two users per segment, the Word Error Rate of the merged captions improved from the 20.7% of the automatic speech recognition (ASR) output to 16%. Finally, we discuss work in progress to improve both the accuracy of the collected data and crowd engagement.
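The abstract reports results in terms of Word Error Rate (WER), the standard ASR quality metric: the word-level edit distance between a hypothesis transcript and the reference, divided by the number of reference words. The sketch below is a minimal illustration of that metric, not the authors' implementation:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    # dp[i][j] = minimum edits to turn the first i reference words
    # into the first j hypothesis words (Levenshtein over words)
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # i deletions
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + sub)  # substitution / match
    return dp[len(ref)][len(hyp)] / len(ref)
```

For example, against the reference "the cat sat down", the hypothesis "the cat sat" has one deletion, giving a WER of 0.25. A 20.7% WER thus means roughly one word in five in the ASR output needs correction.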