Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8854)

Abstract

Automatic subtitling through speech recognition technology has become an important topic in recent years, with most effort centered on improving core speech technology to obtain better recognition results. However, subtitling quality also depends on other parameters that favor the readability and quick understanding of subtitles, such as correct subtitle line segmentation. In this work, we present an approach to automate the segmentation of subtitles through machine learning techniques, allowing the creation of customized models adapted to the specific segmentation rules of subtitling companies. Support Vector Machine and Logistic Regression classifiers were trained on a reference corpus of subtitles manually created by professionals and then used to segment the output of speech recognition engines. We describe the performance of both classifiers and discuss the merits of the approach for the automatic segmentation of subtitles.
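
The abstract does not give the features, training setup, or segmentation rules used in the paper, so the following is only a minimal sketch of the general idea: line segmentation framed as a binary decision (break or no break) at each word boundary, learned from professionally segmented reference lines. Everything in it is an assumption made for illustration: the three toy features, the `FUNCTION_WORDS` list, the 37-character `MAX_CHARS` limit, the `REFERENCE` lines, and the plain gradient-descent logistic regression, which stands in for the library classifiers a real system would use.

```python
import math

# Hypothetical function words before which a line break is often preferred.
FUNCTION_WORDS = {"and", "but", "or", "that", "because", "to", "of", "in"}
MAX_CHARS = 37  # assumed per-line character limit; real limits vary by company

# Toy stand-in for a reference corpus segmented by professionals.
REFERENCE = [
    "We went to the market,",
    "but it was already closed",
    "because of the holiday.",
    "She said that she would come",
    "and bring the documents.",
]


def features(tokens, i, line_len):
    """Illustrative features for the boundary after tokens[i]."""
    nxt = tokens[i + 1].lower() if i + 1 < len(tokens) else ""
    return [
        1.0,                                        # bias term
        min(line_len / MAX_CHARS, 1.5),             # how full the current line is
        1.0 if tokens[i][-1] in ",.;:?!" else 0.0,  # token ends with punctuation
        1.0 if nxt in FUNCTION_WORDS else 0.0,      # next token is a function word
    ]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(w, x):
    """Probability of a line break for feature vector x under weights w."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))


def examples_from_reference(lines):
    """Turn reference lines into (features, label) pairs, one per word boundary."""
    tokens, breaks = [], set()
    for line in lines:
        tokens.extend(line.split())
        breaks.add(len(tokens) - 1)  # index of the last token on each line
    data, line_len = [], 0
    for i in range(len(tokens) - 1):  # the final token has no boundary after it
        line_len += len(tokens[i]) + (1 if line_len else 0)
        label = 1.0 if i in breaks else 0.0
        data.append((features(tokens, i, line_len), label))
        if label:
            line_len = 0  # a new line starts after a reference break
    return data


def train(data, epochs=2000, lr=0.5):
    """Fit logistic regression weights by batch gradient descent."""
    w = [0.0] * len(data[0][0])
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            err = score(w, x) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
        w = [wi - lr * g / len(data) for wi, g in zip(w, grad)]
    return w


def segment(tokens, w, threshold=0.5):
    """Greedily split a token stream (e.g. recognizer output) into lines."""
    lines, current, line_len = [], [], 0
    for i, tok in enumerate(tokens):
        current.append(tok)
        line_len += len(tok) + (1 if line_len else 0)
        if i + 1 < len(tokens) and score(w, features(tokens, i, line_len)) >= threshold:
            lines.append(" ".join(current))
            current, line_len = [], 0
    if current:
        lines.append(" ".join(current))
    return lines


if __name__ == "__main__":
    weights = train(examples_from_reference(REFERENCE))
    text = "He told us that the train was late, and we decided to walk home"
    for line in segment(text.split(), weights):
        print(line)
```

Training on a different reference corpus yields different weights, which is how a model of this kind could be adapted to one company's segmentation conventions without hand-written rules.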

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Álvarez, A., Arzelus, H., Etchegoyhen, T. (2014). Towards Customized Automatic Segmentation of Subtitles. In: Navarro Mesa, J.L., et al. Advances in Speech and Language Technologies for Iberian Languages. Lecture Notes in Computer Science (LNAI), vol 8854. Springer, Cham. https://doi.org/10.1007/978-3-319-13623-3_24

  • DOI: https://doi.org/10.1007/978-3-319-13623-3_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-13622-6

  • Online ISBN: 978-3-319-13623-3

  • eBook Packages: Computer Science, Computer Science (R0)
