Dynamic Subtitle Authoring Method Based on Audio Analysis for the Hearing Impaired

Lim, Wootaek; Jang, Inseon; Ahn, Chunghyun

doi:10.1007/978-3-319-08596-8_9

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8547)

Included in the following conference series:

International Conference on Computers for Handicapped Persons

Abstract

Broadcasting and the Internet are such important parts of modern society that life without media is now unimaginable. However, hearing-impaired people have difficulty understanding media content because the audio information is lost to them. Subtitles displayed with the video can help when they are available. In this paper, we propose a dynamic subtitle authoring method based on audio analysis for the hearing impaired. We analyze the audio signal and explore a set of audio features that includes short-time energy (STE), zero-crossing rate (ZCR), pitch, and mel-frequency cepstral coefficients (MFCC). Using these features, we align the subtitles with the speech and map the extracted speech features onto the subtitle text as different colors, sizes, and thicknesses. Furthermore, the method highlights the text by aligning it with the voice and tags each subtitle with a speaker ID obtained through speaker recognition.
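As an illustration of the frame-level features named in the abstract, the following minimal sketch computes short-time energy and zero-crossing rate with NumPy and maps energy to a font size. It is not the authors' implementation: the frame length, hop size, synthetic test signal, and the `energy_to_font_size` mapping (including its point-size range) are all assumptions chosen for the example.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (a trailing partial frame is dropped)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def short_time_energy(frames):
    """Mean squared amplitude of each frame."""
    return np.mean(frames ** 2, axis=1)

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs in each frame whose signs differ."""
    signs = np.signbit(frames)
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

def energy_to_font_size(ste, min_pt=18.0, max_pt=36.0):
    """Hypothetical mapping (an assumption, not from the paper):
    scale log-energy linearly into a font-size range in points."""
    log_e = 10.0 * np.log10(ste + 1e-12)
    lo, hi = log_e.min(), log_e.max()
    scale = (log_e - lo) / (hi - lo) if hi > lo else np.zeros_like(log_e)
    return min_pt + scale * (max_pt - min_pt)

# Synthetic stand-in for speech: a quiet 200 Hz tone followed by a loud one.
sr = 16000                                   # assumed sample rate in Hz
t = np.arange(sr) / sr
quiet = 0.1 * np.sin(2 * np.pi * 200 * t[: sr // 2])
loud = 0.8 * np.sin(2 * np.pi * 200 * t[: sr // 2])
x = np.concatenate([quiet, loud])

frames = frame_signal(x, frame_len=400, hop=160)   # 25 ms frames, 10 ms hop (assumed)
ste = short_time_energy(frames)
zcr = zero_crossing_rate(frames)
sizes = energy_to_font_size(ste)
```

In this sketch the louder half of the signal yields larger font sizes, while the zero-crossing rate stays low because the tone is periodic and low in frequency. Pitch and MFCC extraction are omitted here because they need more machinery than a short example allows.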

Author information

Authors and Affiliations

Electronics and Telecommunications Research Institute, Realistic Broadcasting Media, Research Department, 218 Gajeong-ro, Yuseong-gu, Daejeon, Korea
Wootaek Lim, Inseon Jang & Chunghyun Ahn

Editor information

Editors and Affiliations

Institute “Integriert Studieren”, Johannes Kepler University, Altenbergerstr. 69, 4040, Linz, Austria
Klaus Miesenberger
Ryerson University, 350 Victoria Street, M5B 2K3, Toronto, ON, Canada
Deborah Fels
THIM/EA 4004 CHArt, Université Paris 8 Vincennes-Saint Denis, 2 rue de la liberté, 93526, Saint-Denis Cedex, France
Dominique Archambault
Teiresias Centre, Masaryk University, Botanická 68A, 602 00, Brno, Czech Republic
Petr Peňáz
Institute “Integriert Studieren”, Vienna University of Technology, Favoritenstraße 11/029, Vienna, 1050, Austria
Wolfgang Zagler

About this paper

Cite this paper

Lim, W., Jang, I., Ahn, C. (2014). Dynamic Subtitle Authoring Method Based on Audio Analysis for the Hearing Impaired. In: Miesenberger, K., Fels, D., Archambault, D., Peňáz, P., Zagler, W. (eds) Computers Helping People with Special Needs. ICCHP 2014. Lecture Notes in Computer Science, vol 8547. Springer, Cham. https://doi.org/10.1007/978-3-319-08596-8_9

DOI: https://doi.org/10.1007/978-3-319-08596-8_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08595-1
Online ISBN: 978-3-319-08596-8
eBook Packages: Computer Science, Computer Science (R0)
