Abstract
Speech-to-text summarization systems usually take as input the output of an automatic speech recognition (ASR) system that is affected by issues like speech recognition errors, disfluencies, or difficulties in the accurate identification of sentence boundaries. We describe the inclusion of related, solid background information to cope with the difficulties of summarizing spoken language and the use of multi-document summarization techniques in single document speech-to-text summarization. In this work, we explore the possibilities offered by phonetic information to select the background information and conduct a perceptual evaluation to better assess the relevance of the inclusion of that information. Results show that summaries generated using this approach are considerably better than those produced by an up-to-date latent semantic analysis (LSA) summarization method and suggest that humans prefer summaries restricted to the information conveyed in the input source.
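The abstract does not say how phonetic information is used to select background information, so the following is only a minimal sketch of one plausible approach, not the method of the chapter. It assumes that both the ASR output and the candidate background sentences are available as phone sequences, and it keeps the background sentences whose normalized phone edit distance to some ASR sentence is small. The function names, the unit edit costs, and the 0.6 threshold are all hypothetical.

```python
from typing import List, Sequence, Tuple


def phone_edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phone sequences (unit costs, an assumption)."""
    prev = list(range(len(b) + 1))
    for i, phone_a in enumerate(a, start=1):
        curr = [i]
        for j, phone_b in enumerate(b, start=1):
            cost = 0 if phone_a == phone_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phone_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Similarity in [0, 1]: 1.0 for identical sequences."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - phone_edit_distance(a, b) / longest


def select_background(
    asr_sentences: List[Sequence[str]],
    background: List[Tuple[str, Sequence[str]]],
    threshold: float = 0.6,  # hypothetical cut-off, not taken from the chapter
) -> List[Tuple[str, float]]:
    """Keep background sentences phonetically close to at least one ASR sentence."""
    selected = []
    for text, phones in background:
        best = max((phone_similarity(phones, s) for s in asr_sentences), default=0.0)
        if best >= threshold:
            selected.append((text, best))
    # Most similar background sentences first.
    return sorted(selected, key=lambda item: item[1], reverse=True)
```

Comparing phones rather than words is meant to tolerate recognition errors that change the spelling of a word but leave most of its pronunciation intact. A real system would likely replace the unit costs with costs that reflect how confusable two phones are.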
Acknowledgements
We would like to thank Fernando Batista for his help with the speech corpus, Joana Paulo Pardal for her help with the web evaluation form, and all the human judges for their invaluable contribution. We would also like to thank the anonymous reviewers for their insightful comments.
This work was partially supported by FCT (INESC-ID multiannual funding) through the PIDDAC Program funds.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Ribeiro, R., de Matos, D.M. (2013). Improving Speech-to-Text Summarization by Using Additional Information Sources. In: Poibeau, T., Saggion, H., Piskorski, J., Yangarber, R. (eds) Multi-source, Multilingual Information Extraction and Summarization. Theory and Applications of Natural Language Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28569-1_13
DOI: https://doi.org/10.1007/978-3-642-28569-1_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28568-4
Online ISBN: 978-3-642-28569-1
eBook Packages: Computer Science, Computer Science (R0)