Text Summarization and Speech Synthesis for the Automated Generation of Personalized Audio Presentations

Lawless, Séamus; Lavin, Peter; Bayomi, Mostafa; Cabral, João P.; Ghorab, M. Rami

doi:10.1007/978-3-319-19581-0_28

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9103)

Included in the following conference series:

International Conference on Applications of Natural Language to Information Systems

Abstract

In today’s fast-paced world, users must consume a large amount of content in a short time. This challenge is exacerbated by the fact that content is scattered across different languages and locations. This research addresses these challenges using several natural language processing techniques: adapting content through automatic text summarization; enhancing content accessibility through machine translation; and altering the delivery modality through speech synthesis. This paper introduces Lean-back Learning (LbL), an information system that delivers automatically generated audio presentations for consumption in a “lean-back” fashion, i.e. in hands-busy, eyes-busy situations. These presentations are personalized and are generated using multilingual multi-document text summarization. The paper discusses the system’s components and algorithms, as well as initial system evaluations.
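
The three stages named in the abstract (summarization, machine translation, speech synthesis) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the word-frequency sentence scoring is a generic extractive heuristic and not the LbL algorithm, the order in which the stages run is a guess, and `translate` and `synthesize` are hypothetical callables standing in for whatever MT and TTS services a real system would call.

```python
import re
from collections import Counter
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(documents: List[str], max_sentences: int = 2) -> List[str]:
    """Select the highest-scoring sentences across all documents.

    Score = mean corpus frequency of the words in the sentence.
    This is an assumed stand-in heuristic, not the method used in the paper.
    """
    sentences = [s for doc in documents for s in split_sentences(doc)]
    tokenized = [re.findall(r"[a-z']+", s.lower()) for s in sentences]
    freq = Counter(word for words in tokenized for word in words)

    def score(words: List[str]) -> float:
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(
        range(len(sentences)),
        key=lambda i: score(tokenized[i]),
        reverse=True,
    )
    chosen = sorted(ranked[:max_sentences])  # restore original reading order
    return [sentences[i] for i in chosen]


def build_presentation(
    documents: List[str],
    translate: Callable[[str], str],    # hypothetical MT hook
    synthesize: Callable[[str], bytes],  # hypothetical TTS hook
    max_sentences: int = 2,
) -> bytes:
    """Summarize, then translate, then synthesize (stage order is assumed)."""
    summary = summarize(documents, max_sentences)
    translated = [translate(sentence) for sentence in summary]
    return synthesize(" ".join(translated))
```

Passing the MT and TTS steps in as callables keeps the sketch independent of any particular service; a caller could, for instance, supply `str.upper` and `lambda t: t.encode("utf-8")` as placeholders when trying it out.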

Notes

1. The MT service used in the experiments is the Bing Translation API. Bing was chosen because: (a) it is generally known to perform relatively well in terms of translation quality and speed; (b) it supports a range of languages; and (c) it provides a well-defined RESTful Web service interface.

Acknowledgements

This research is supported by the Science Foundation Ireland (grant 07/CE/I1142) as part of the Centre for Next Generation Localisation (www.cngl.ie) at Trinity College, Dublin.

Author information

Authors and Affiliations

CNGL Centre for Global Intelligent Content, Knowledge and Data Engineering Group, School of Computer Science and Statistics, Trinity College Dublin, Dublin, Ireland
Séamus Lawless, Peter Lavin, Mostafa Bayomi, João P. Cabral & M. Rami Ghorab

Corresponding author

Correspondence to Séamus Lawless.

Editor information

Editors and Affiliations

Technische Universität Darmstadt, Darmstadt, Germany
Chris Biemann
Universität Passau, Passau, Germany
Siegfried Handschuh
Universität Passau, Passau, Germany
André Freitas
University of Salford, Salford, United Kingdom
Farid Meziane
Conservatoire National des Arts et Métiers, Paris, France
Elisabeth Métais

Cite this paper

Lawless, S., Lavin, P., Bayomi, M., Cabral, J.P., Ghorab, M.R. (2015). Text Summarization and Speech Synthesis for the Automated Generation of Personalized Audio Presentations. In: Biemann, C., Handschuh, S., Freitas, A., Meziane, F., Métais, E. (eds) Natural Language Processing and Information Systems. NLDB 2015. Lecture Notes in Computer Science, vol 9103. Springer, Cham. https://doi.org/10.1007/978-3-319-19581-0_28

DOI: https://doi.org/10.1007/978-3-319-19581-0_28
Published: 04 June 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-19580-3
Online ISBN: 978-3-319-19581-0
eBook Packages: Computer Science; Computer Science (R0)
