Improving Speech-to-Text Summarization by Using Additional Information Sources

Multi-source, Multilingual Information Extraction and Summarization

Abstract

Speech-to-text summarization systems usually take as input the output of an automatic speech recognition (ASR) system, which is affected by issues such as speech recognition errors, disfluencies, or difficulties in accurately identifying sentence boundaries. We describe the inclusion of related, solid background information to cope with the difficulties of summarizing spoken language, and the use of multi-document summarization techniques in single-document speech-to-text summarization. In this work, we explore the possibilities offered by phonetic information for selecting the background information, and we conduct a perceptual evaluation to better assess the relevance of including that information. Results show that summaries generated using this approach are considerably better than those produced by an up-to-date latent semantic analysis (LSA) summarization method, and suggest that humans prefer summaries restricted to the information conveyed in the input source.
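The abstract does not spell out how phonetic information is used to select background material, so the following is only a generic sketch of one plausible approach, not the chapter's method. It assumes that both the ASR output and each candidate background sentence are already available as sequences of phone symbols, and it scores candidates with a normalized edit distance. The function names, the phone strings, and the 0.6 threshold are all hypothetical.

```python
from typing import List, Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phone sequences (unit costs)."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # delete a phone from a
                            curr[j - 1] + 1,      # insert a phone from b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def phonetic_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Similarity in [0, 1]: 1 minus the length-normalized edit distance."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def select_background(asr_phones: Sequence[str],
                      candidates: List[Sequence[str]],
                      threshold: float = 0.6) -> List[int]:
    """Return indices of candidates phonetically close to the ASR segment.

    The threshold is an illustrative assumption, not a value from the chapter.
    """
    return [i for i, cand in enumerate(candidates)
            if phonetic_similarity(asr_phones, cand) >= threshold]


if __name__ == "__main__":
    # Hypothetical phone sequences, for illustration only.
    asr = ["k", "a", "z", "a"]
    background = [
        ["k", "a", "s", "a"],       # one substitution away
        ["m", "e", "z", "a"],       # two substitutions away
        ["l", "i", "v", "r", "u"],  # no phones in common
    ]
    print(select_background(asr, background))
```

Comparing phone sequences rather than words lets a misrecognized word still match a correctly spelled background passage when the two sound alike. A practical system would probably weight substitutions by phonetic feature distance instead of using unit costs.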

Acknowledgements

We would like to thank Fernando Batista for his help with the speech corpus, Joana Paulo Pardal for her help with the web evaluation form, and all the human judges for their invaluable contribution. We would also like to thank the anonymous reviewers for their insightful comments.

This work was partially supported by FCT (INESC-ID multiannual funding) through the PIDDAC Program funds.

Corresponding author

Correspondence to Ricardo Ribeiro.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Ribeiro, R., de Matos, D.M. (2013). Improving Speech-to-Text Summarization by Using Additional Information Sources. In: Poibeau, T., Saggion, H., Piskorski, J., Yangarber, R. (eds) Multi-source, Multilingual Information Extraction and Summarization. Theory and Applications of Natural Language Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28569-1_13

  • DOI: https://doi.org/10.1007/978-3-642-28569-1_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28568-4

  • Online ISBN: 978-3-642-28569-1

  • eBook Packages: Computer Science, Computer Science (R0)
