Abstract
Audio description is an accessibility service used by blind or visually impaired individuals. Often accompanying movies, television shows, and other visual art forms, the service provides spoken descriptions of visual content, allowing people with vision loss to access information that sighted people obtain visually. At live theatrical events, audio description provides spoken descriptions of scenes, characters, props, and other visual elements that those with vision loss may otherwise find inaccessible. This paper explores a novel approach to automating the creation and deployment of audio description for repeatable live theatrical events, both musical and non-musical. Using readily available tools and established sound-processing techniques, we describe a framework that automates several aspects of audio description for theater. The method uses a reference audio recording and an online time warping algorithm to align audio description with live performances, including a process for handling unexpected interruptions. We also show how a reference audio recording and descriptive tracks can be generated automatically from a show's script, making audio description feasible for productions that lack the resources for multiple described performances or a live audio describer. Finally, we describe a software implementation integrated into an existing theatrical workflow. This system is used in three evaluation experiments showing that the method successfully aligns multiple recordings of works of musical theater and non-musical plays in order to automatically trigger pre-recorded, descriptive audio in real time.
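The core alignment idea can be illustrated with a small sketch. This is not the authors' implementation; it is a simplified, hypothetical variant of on-line dynamic time warping in which each frame of live audio is represented by a feature vector (e.g., MFCCs), the current position in the reference recording is tracked incrementally within a search band, and description cues attached to reference frame indices are triggered as the alignment passes them. All names (`OnlineAligner`, `step`, the cue labels) are illustrative assumptions.

```python
# Minimal sketch of cue triggering via simplified on-line time warping.
# Assumption: reference and live audio are already converted to per-frame
# feature vectors (e.g., MFCCs); cues map reference frame indices to
# pre-recorded description labels.

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

class OnlineAligner:
    def __init__(self, reference, cues, window=20):
        self.ref = reference      # list of reference feature vectors
        self.cues = dict(cues)    # {reference frame index: description label}
        self.window = window      # search band half-width, in ref frames
        self.prev = None          # accumulated costs for the previous live frame
        self.pos = -1             # current estimated reference position

    def step(self, frame):
        """Consume one live feature frame; return any cues now due."""
        lo = max(0, self.pos - self.window)
        hi = min(len(self.ref), self.pos + self.window + 1)
        costs = {}
        for j in range(lo, hi):
            d = dist(frame, self.ref[j])
            if self.prev is None:
                costs[j] = d + j  # first frame: mild bias toward the start
            else:
                # standard DTW predecessors: (i-1, j), (i-1, j-1), (i, j-1)
                best = min(self.prev.get(j, float("inf")),
                           self.prev.get(j - 1, float("inf")),
                           costs.get(j - 1, float("inf")))
                costs[j] = d + best
        self.prev = costs
        new_pos = min(costs, key=costs.get)
        fired = [self.cues[j] for j in sorted(self.cues)
                 if self.pos < j <= new_pos]
        self.pos = max(self.pos, new_pos)  # alignment only moves forward
        return fired
```

A full system would compute MFCC frames from microphone input in real time and play the corresponding pre-recorded description audio when a cue fires; restarting the band search (e.g., resetting `prev` and `pos`) is one plausible way to handle an unexpected interruption, though the paper's own recovery process may differ.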
Acknowledgments
The authors would like to thank Juan Pablo Bello for his assistance and the Music and Audio Research Laboratory (MARL) at NYU Steinhardt.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Vander Wilt, D., Farbood, M.M. A new approach to creating and deploying audio description for live theater. Pers Ubiquit Comput 25, 771–781 (2021). https://doi.org/10.1007/s00779-020-01406-2