Abstract
Automatic subtitling through speech recognition technology has become an important topic in recent years, with most effort centered on improving core speech technology to obtain better recognition results. However, subtitling quality also depends on other parameters that favor the readability and quick understanding of subtitles, such as correct subtitle line segmentation. In this work, we present an approach to automate the segmentation of subtitles through machine learning techniques, allowing the creation of customized models adapted to the specific segmentation rules of subtitling companies. Support Vector Machine and Logistic Regression classifiers were trained on a reference corpus of subtitles manually created by professionals and then used to segment the output of speech recognition engines. We describe the performance of both classifiers and discuss the merits of the approach for the automatic segmentation of subtitles.
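The general idea of learning line breaks from professionally segmented subtitles can be illustrated with a minimal sketch. It is not the authors' system: the abstract does not state the features, toolkit, or corpus used, so the feature set, the word lists, the 37-character line limit, and the four-subtitle toy corpus below are all assumptions made purely for illustration. The sketch trains a small logistic regression model in plain Python to score the gap after each token as a possible line break, then segments an unbroken token stream such as recognizer output.

```python
import math

# All constants below are illustrative assumptions, not values from the paper.
MAX_CHARS = 37  # assumed per-line character limit; real guidelines vary
CONNECTIVES = {"and", "but", "or", "that", "because"}   # good words to start a line
DEPENDENTS = {"the", "a", "of", "to", "in", "on"}       # poor words to end a line

def features(tokens, i, line_len):
    """Hypothetical features for the candidate break after tokens[i]."""
    nxt = tokens[i + 1].lower() if i + 1 < len(tokens) else ""
    return [
        1.0,                                              # bias term
        1.0 if tokens[i][-1] in ",.;:?!" else 0.0,        # token ends in punctuation
        1.0 if nxt in CONNECTIVES else 0.0,               # next token is a connective
        1.0 if tokens[i].lower() in DEPENDENTS else 0.0,  # token is a dependent word
        line_len / MAX_CHARS,                             # how full the line already is
    ]

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to keep math.exp in a safe range
    return 1.0 / (1.0 + math.exp(-z))

def score(w, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

def samples_from(subtitle_lines):
    """Turn one reference subtitle (a list of lines) into labeled samples."""
    tokens = [t for line in subtitle_lines for t in line.split()]
    ends, pos = set(), -1
    for line in subtitle_lines[:-1]:
        pos += len(line.split())
        ends.add(pos)  # index of the last token on each non-final line
    out, line_len = [], 0
    for i in range(len(tokens) - 1):
        line_len += len(tokens[i]) + (1 if line_len else 0)
        out.append((features(tokens, i, line_len), 1.0 if i in ends else 0.0))
        if i in ends:
            line_len = 0
    return out

def train(samples, epochs=300, lr=0.5):
    """Logistic regression fitted with plain stochastic gradient ascent."""
    w = [0.0] * len(samples[0][0])
    for _ in range(epochs):
        for x, y in samples:
            p = score(w, x)
            w = [wi + lr * (y - p) * xi for wi, xi in zip(w, x)]
    return w

def segment(tokens, w, threshold=0.5):
    """Break after a token when the model says so or the next token would overflow."""
    lines, current, line_len = [], [], 0
    for i, tok in enumerate(tokens):
        current.append(tok)
        line_len += len(tok) + (1 if line_len else 0)
        if i == len(tokens) - 1:
            break
        p = score(w, features(tokens, i, line_len))
        overflow = line_len + 1 + len(tokens[i + 1]) > MAX_CHARS
        if p >= threshold or overflow:
            lines.append(" ".join(current))
            current, line_len = [], 0
    lines.append(" ".join(current))
    return lines

# Invented toy corpus standing in for a professional reference corpus.
CORPUS = [
    ["I told him yesterday,", "but he did not listen."],
    ["We went to the market", "and bought some fruit."],
    ["She said that she was tired,", "because the trip was long."],
    ["They arrived late", "that night."],
]
w = train([s for sub in CORPUS for s in samples_from(sub)])

if __name__ == "__main__":
    text = "He called me this morning, but I was already in the office and missed it."
    for line in segment(text.split(), w):
        print(line)
```

Training on a different reference corpus yields different weights, which is how a model of this kind could adapt to one company's segmentation rules without hand-written heuristics. The hard length check in `segment` is a design choice of this sketch: it guarantees that no line exceeds the limit even when the classifier proposes no break.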
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Álvarez, A., Arzelus, H., Etchegoyhen, T. (2014). Towards Customized Automatic Segmentation of Subtitles. In: Navarro Mesa, J.L., et al. Advances in Speech and Language Technologies for Iberian Languages. Lecture Notes in Computer Science, vol 8854. Springer, Cham. https://doi.org/10.1007/978-3-319-13623-3_24
DOI: https://doi.org/10.1007/978-3-319-13623-3_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13622-6
Online ISBN: 978-3-319-13623-3
eBook Packages: Computer Science (R0)