DOI: 10.1145/2661334.2661381
Research Article

Enhancing caption accessibility through simultaneous multimodal information: visual-tactile captions

Published: 20 October 2014

Abstract

Captions (subtitles) for television and movies have greatly enhanced accessibility for Deaf and hard of hearing (DHH) consumers who cannot access the audio but can otherwise follow along by reading the captions. However, captions fail to fully convey auditory information, owing to the simultaneous delivery of aural and visual content and the lack of standardization in representing non-speech information.
Viewers cannot watch the movie scenes and read the visual captions at the same time; instead, they must switch between the two and inevitably lose information and context. Hearing viewers, by contrast, can listen to the audio and watch the scenes simultaneously.
Most auditory non-speech information (NSI) is not easily represented in words, e.g., describing a ring tone or the sound of something falling. We enhance captions with tactile and visual-tactile feedback. For the former, we transform auditory NSI into an equivalent tactile representation and convey it simultaneously with the captions. For the latter, we additionally indicate the on-screen location of the NSI. This approach can benefit DHH viewers by conveying more aural content to the visual and tactile senses simultaneously than visual-only captions can. We conducted a study comparing DHH viewers' responses to video with captions, tactile captions, and visual-tactile captions; the viewers significantly benefited from both tactile and visual-tactile captions.
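
The aural-to-tactile transformation described above can be illustrated with a small sketch. The paper does not publish its implementation, so the following Python is a hypothetical illustration only, assuming a mono soundtrack, a 50 ms actuator update rate, and an 8-level vibration motor; the function names (rms_envelope, to_vibration_levels) and the actuator interface are invented for this example.

    # Hypothetical sketch (not the authors' implementation): map the
    # non-speech energy of an audio track to vibrotactile intensity frames.
    import numpy as np

    FRAME_MS = 50         # assumed update interval for the tactile actuator
    SAMPLE_RATE = 16_000  # assumed mono audio sample rate

    def rms_envelope(audio: np.ndarray, frame_len: int) -> np.ndarray:
        """Short-time RMS energy: one value per tactile frame."""
        n_frames = len(audio) // frame_len
        frames = audio[: n_frames * frame_len].reshape(n_frames, frame_len)
        return np.sqrt((frames ** 2).mean(axis=1))

    def to_vibration_levels(env: np.ndarray, n_levels: int = 8) -> np.ndarray:
        """Quantize the envelope into discrete motor-intensity levels."""
        norm = env / (env.max() + 1e-9)  # scale to [0, 1]
        return np.round(norm * (n_levels - 1)).astype(int)

    if __name__ == "__main__":
        # Stand-in for a real soundtrack: a decaying low-frequency "thud".
        t = np.linspace(0.0, 1.0, SAMPLE_RATE)
        audio = np.sin(2 * np.pi * 60 * t) * np.exp(-4 * t)

        frame_len = SAMPLE_RATE * FRAME_MS // 1000
        levels = to_vibration_levels(rms_envelope(audio, frame_len))
        # In a full system, this level stream would be sent to a vibrotactile
        # actuator in sync with the caption timeline.
        print(levels)

The sketch captures only the intensity envelope; a richer mapping could also encode the sound's frequency content or, for visual-tactile captions, pair each tactile frame with an on-screen marker at the NSI source.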




Published In

ASSETS '14: Proceedings of the 16th international ACM SIGACCESS conference on Computers & accessibility
October 2014
378 pages
ISBN:9781450327206
DOI:10.1145/2661334

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. aural-to-tactile information
  2. caption readability
  3. deaf and hard of hearing users
  4. multi-modal interfaces

Qualifiers

  • Research-article

Conference

ASSETS '14

Acceptance Rates

ASSETS '14 paper acceptance rate: 29 of 106 submissions, 27%.
Overall acceptance rate: 436 of 1,556 submissions, 28%.


Article Metrics

  • Downloads (last 12 months): 76
  • Downloads (last 6 weeks): 5
Reflects downloads up to 07 Mar 2025


Cited By

  • (2024) Towards a Rich Format for Closed-Captioning. Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility, pp. 1-5. DOI: 10.1145/3663548.3688504. Online publication date: 27-Oct-2024.
  • (2024) Envisioning Collective Communication Access: A Theoretically-Grounded Review of Captioning Literature from 2013-2023. Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility, pp. 1-18. DOI: 10.1145/3663548.3675649. Online publication date: 27-Oct-2024.
  • (2024) "I Wish You Could Make the Camera Stand Still": Envisioning Media Accessibility Interventions with People with Aphasia. Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility, pp. 1-17. DOI: 10.1145/3663548.3675598. Online publication date: 27-Oct-2024.
  • (2024) Unspoken Sound: Identifying Trends in Non-Speech Audio Captioning on YouTube. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, pp. 1-19. DOI: 10.1145/3613904.3642162. Online publication date: 11-May-2024.
  • (2023) Enhancing Non-Speech Information Communicated in Closed Captioning Through Critical Design. Proceedings of the 25th International ACM SIGACCESS Conference on Computers and Accessibility, pp. 1-14. DOI: 10.1145/3597638.3608398. Online publication date: 22-Oct-2023.
  • (2023) Accessibility Research in Digital Audiovisual Media: What Has Been Achieved and What Should Be Done Next? Proceedings of the 2023 ACM International Conference on Interactive Media Experiences, pp. 94-114. DOI: 10.1145/3573381.3596159. Online publication date: 12-Jun-2023.
  • (2023) Visible Nuances: A Caption System to Visualize Paralinguistic Speech Cues for Deaf and Hard-of-Hearing Individuals. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, pp. 1-15. DOI: 10.1145/3544548.3581130. Online publication date: 19-Apr-2023.
  • (2023) Haptic-Captioning: Using Audio-Haptic Interfaces to Enhance Speaker Indication in Real-Time Captions for Deaf and Hard-of-Hearing Viewers. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, pp. 1-14. DOI: 10.1145/3544548.3581076. Online publication date: 19-Apr-2023.
  • (2023) "Easier or Harder, Depending on Who the Hearing Person Is": Codesigning Videoconferencing Tools for Small Groups with Mixed Hearing Status. Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, pp. 1-15. DOI: 10.1145/3544548.3580809. Online publication date: 19-Apr-2023.
  • (2023) Augmented Reality Visual-Captions: Enhancing Captioning Experience for Real-Time Conversations. Distributed, Ambient and Pervasive Interactions, pp. 380-396. DOI: 10.1007/978-3-031-34609-5_28. Online publication date: 9-Jul-2023.
