Improving Speech-to-Text Summarization by Using Additional Information Sources

Ribeiro, Ricardo; de Matos, David Martins

doi:10.1007/978-3-642-28569-1_13

Ricardo Ribeiro &
David Martins de Matos

Part of the book series: Theory and Applications of Natural Language Processing (NLP)

Abstract

Speech-to-text summarization systems usually take as input the output of an automatic speech recognition (ASR) system that is affected by issues like speech recognition errors, disfluencies, or difficulties in the accurate identification of sentence boundaries. We describe the inclusion of related, solid background information to cope with the difficulties of summarizing spoken language and the use of multi-document summarization techniques in single document speech-to-text summarization. In this work, we explore the possibilities offered by phonetic information to select the background information and conduct a perceptual evaluation to better assess the relevance of the inclusion of that information. Results show that summaries generated using this approach are considerably better than those produced by an up-to-date latent semantic analysis (LSA) summarization method and suggest that humans prefer summaries restricted to the information conveyed in the input source.

Notes

1.
http://www.gnu.org/software/gsl/

Acknowledgements

We would like to thank Fernando Batista for his help with the speech corpus, Joana Paulo Pardal for her help with the web evaluation form, and all the human judges for their invaluable contribution. We would also like to thank the anonymous reviewers for their insightful comments.

This work was partially supported by FCT (INESC-ID multiannual funding) through the PIDDAC Program funds.

Author information

Authors and Affiliations

L2F – INESC ID/ISCTE – Instituto Universitário de Lisboa, Rua Alves Redol, 9, 1000-029, Lisboa, Portugal
Ricardo Ribeiro
L2F – INESC ID/IST, Rua Alves Redol, 9, 1000-029, Lisboa, Portugal
David Martins de Matos

Corresponding author

Correspondence to Ricardo Ribeiro.

Editor information

Editors and Affiliations

Universite Sorbonne Nouvelle, LATTICE-CNRS, Ecole Normale Superieure, rue d'Ulm 45, Paris, 75005, France
Thierry Poibeau
Information & Communication Technologies, Universitat Pompeu Fabra, C/ Tanger 122-140, Barcelona, 08018, Spain
Horacio Saggion
Institute for Computer Science, Polish Academy of Sciences, ul. Jana Kazimierza 5, Warsaw, 01-248, Poland
Jakub Piskorski
Department of Computer Science, University of Helsinki, Gustaf Hällströmin katu 2, Helsinki, 00014, Finland
Roman Yangarber

About this chapter

Cite this chapter

Ribeiro, R., de Matos, D.M. (2013). Improving Speech-to-Text Summarization by Using Additional Information Sources. In: Poibeau, T., Saggion, H., Piskorski, J., Yangarber, R. (eds) Multi-source, Multilingual Information Extraction and Summarization. Theory and Applications of Natural Language Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28569-1_13

DOI: https://doi.org/10.1007/978-3-642-28569-1_13
Published: 12 July 2012
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28568-4
Online ISBN: 978-3-642-28569-1
eBook Packages: Computer Science, Computer Science (R0)
