ABSTRACT
We describe an initial attempt to develop a common platform for adding audio descriptions (ADs) to online videos so that blind and visually impaired people can enjoy such material. Speech synthesis technology allows content providers to offer ADs at minimal cost. We store the AD as external metadata so that it remains independent of the video format; this external approach also lets third-party supporters add ADs to any online video. Our technology includes an authoring tool for writing AD scripts, a Web browser add-on that synthesizes ADs synchronized with the original videos, and a text-based format for exchanging AD scripts.
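To illustrate how an external, format-independent AD script might drive synchronized playback, here is a minimal sketch. The timestamped line format and the function names are hypothetical assumptions for illustration only; the paper's actual exchange format is not reproduced here.

```python
# Hypothetical external AD script: one cue per line, "MM:SS.mmm<TAB>description".
# A browser add-on could look up the active cue at the current playback time
# and hand its text to a speech synthesizer.
import bisect

def parse_script(text):
    """Parse 'MM:SS.mmm<TAB>description' lines into sorted (seconds, text) cues."""
    cues = []
    for line in text.strip().splitlines():
        stamp, desc = line.split("\t", 1)
        minutes, seconds = stamp.split(":")
        cues.append((int(minutes) * 60 + float(seconds), desc))
    cues.sort()
    return cues

def cue_at(cues, t):
    """Return the description whose start time is the latest one at or before t."""
    times = [start for start, _ in cues]
    i = bisect.bisect_right(times, t) - 1
    return cues[i][1] if i >= 0 else None

script = (
    "00:05.000\tA reporter stands outside the courthouse.\n"
    "00:12.500\tThe camera pans to the crowd."
)
cues = parse_script(script)
print(cue_at(cues, 13.0))  # → The camera pans to the crowd.
```

Because the cues live outside the video file, the same script could annotate the same clip wherever it is hosted, which is what makes third-party authoring possible.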