DOI: 10.1145/3663548.3688504

Poster

Towards a Rich Format for Closed-Captioning

Published: 27 October 2024

Abstract

Closed-captioning is an essential part of viewing audio-visual content for many people, including those who are D/deaf and Hard-of-Hearing. Traditional closed-captioning systems generally consist of a single track of timed text that offers limited options for personalization. Research into extending the capabilities of captioning, such as affective, poetic, and customizable captions, has shown that a subset of users desires these features, but only in specific contexts. However, because it is difficult to create custom stimulus videos with such captioning systems, comparisons between systems and longitudinal studies have not been pursued. This demo paper introduces Rich Captions, a structured system that allows a single closed-caption file to be tagged with additional information, which can then be flexibly leveraged to render customizable, creative, and poetic captions from the same file. Additionally, we introduce the Rich Caption Editor, a free, open-source software system designed to author, edit, and render rich captions. The system design was informed by a formative design workshop with closed-captioning researchers and advocates. The current design allows researchers to generate reproducible stimuli for closed-captioning studies. Once the design space and user preferences are better understood, the rich captioning framework could be refined to serve a general audience.
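The abstract does not reproduce the Rich Captions schema itself, so the TypeScript sketch below is purely illustrative of the core idea it describes: a single caption file carries optional tags, and a viewer-side renderer decides which tags to surface. Every name here (`RichCue`, `renderCue`, the `emotion`, `intensity`, and `soundDescription` fields) is a hypothetical assumption, not the Rich Caption Editor's actual format.

```typescript
// Hypothetical sketch of a tagged "rich caption" cue. Field names are
// assumptions for illustration only, not the paper's actual schema.
interface RichCue {
  start: number;              // cue start time, in seconds
  end: number;                // cue end time, in seconds
  text: string;               // plain caption text
  speaker?: string;           // optional speaker label
  emotion?: string;           // optional affect tag, e.g. "furious"
  intensity?: number;         // optional affect intensity in [0, 1]
  soundDescription?: string;  // optional non-speech sound description
}

// A viewer-side preference profile: the same file can be rendered
// differently depending on which tags the viewer wants surfaced.
interface RenderPrefs {
  showSpeaker: boolean;
  showAffect: boolean;
  showNonSpeech: boolean;
}

// Render one cue as plain text according to the viewer's preferences.
// A richer renderer could instead map intensity to typography or color.
function renderCue(cue: RichCue, prefs: RenderPrefs): string {
  const parts: string[] = [];
  if (prefs.showSpeaker && cue.speaker) parts.push(`${cue.speaker}:`);
  parts.push(cue.text);
  if (prefs.showAffect && cue.emotion) parts.push(`[${cue.emotion}]`);
  if (prefs.showNonSpeech && cue.soundDescription) {
    parts.push(`(${cue.soundDescription})`);
  }
  return parts.join(" ");
}

// Example: one tagged cue, two different renderings of the same data.
const cue: RichCue = {
  start: 12.0,
  end: 14.5,
  text: "Get out of my house!",
  speaker: "MARGARET",
  emotion: "furious",
  intensity: 0.9,
};

console.log(renderCue(cue, { showSpeaker: true, showAffect: true, showNonSpeech: true }));
// -> "MARGARET: Get out of my house! [furious]"
console.log(renderCue(cue, { showSpeaker: false, showAffect: false, showNonSpeech: false }));
// -> "Get out of my house!"
```

The design point this sketch illustrates is that personalization lives in the renderer, not the file: the same tagged cue can be rendered plainly or expressively without re-authoring the captions, which is what would make reproducible study stimuli possible.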



Published In

ASSETS '24: Proceedings of the 26th International ACM SIGACCESS Conference on Computers and Accessibility
October 2024
1475 pages
ISBN: 9798400706776
DOI: 10.1145/3663548
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 27 October 2024

Author Tags

  1. Affective Captioning
  2. Closed-Captioning
  3. Creative Captioning
  4. Subtitles

Qualifiers

  • Poster
  • Research
  • Refereed limited

Conference

ASSETS '24

Acceptance Rates

Overall acceptance rate: 436 of 1,556 submissions (28%)

