Abstract
Multi-media archives are difficult to display on screen and difficult to retrieve and browse. It is therefore important to develop technologies that summarize entire archives of network content to help users browse and retrieve them. In a recent paper [1] we proposed a complete set of multi-layered technologies to handle at least some of these issues: (1) Automatic Generation of Titles and Summaries for each spoken document, so that the spoken documents become much easier to browse; (2) Global Semantic Structuring of the entire spoken document archive, offering the user a global picture of the semantic structure of the archive; and (3) Query-based Local Semantic Structuring for the subset of spoken documents retrieved by the user's query, giving the user the detailed semantic structure of the relevant spoken documents for the query entered. Probabilistic Latent Semantic Analysis (PLSA) is found to be helpful. This paper presents an initial prototype system for Chinese archives with the functions mentioned above, using a broadcast news archive in Mandarin Chinese as the example archive.
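As a rough illustration of the PLSA model mentioned in the abstract, the sketch below fits the standard formulation (documents as mixtures of latent topics, topics as distributions over terms) by expectation-maximization. It is not the system described in the paper: the function name `plsa`, its parameters, and the toy count matrix are all assumptions made for illustration, and a real spoken document archive would start from recognition output rather than hand-written counts.

```python
import numpy as np


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit PLSA by EM on a document-term count matrix (illustrative sketch).

    counts: array-like of shape (n_docs, n_terms) with non-negative counts.
    Returns (p_z_d, p_w_z): P(topic | doc) of shape (n_docs, n_topics)
    and P(term | topic) of shape (n_topics, n_terms).
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts, dtype=float)
    n_docs, n_terms = counts.shape

    # Random initialisation, normalised into probability distributions.
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_terms))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)

    eps = 1e-12  # guards against division by zero
    for _ in range(n_iter):
        # E-step: posterior P(z | d, w), shape (n_docs, n_topics, n_terms).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]
        post = joint / (joint.sum(axis=1, keepdims=True) + eps)

        # M-step: re-estimate both distributions from expected counts.
        expected = counts[:, None, :] * post
        p_w_z = expected.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = expected.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps
    return p_z_d, p_w_z


# Hypothetical toy archive: 4 documents over a 4-term vocabulary.
toy_counts = [
    [4, 3, 0, 0],
    [5, 2, 0, 0],
    [0, 0, 3, 4],
    [0, 0, 2, 5],
]
doc_topics, topic_terms = plsa(toy_counts, n_topics=2)
```

Each row of `doc_topics` is a distribution over latent topics for one document. Topic mixtures of this kind are what semantic structuring of an archive could be built on, though how the prototype uses them is not specified in the abstract.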
References
Lee, L.-S., Kong, S.-Y., Pan, Y.-C., Fu, Y.-S., Huang, Y.-T.: Multi-layered summarization of spoken document archives by information extraction and semantic structuring. In: Interspeech (2006) (to appear)
Lee, L.-S., Chen, B.: Spoken document understanding and organization. IEEE Signal Processing Magazine 22(5) (September 2005)
CMU Informedia Digital Video Library project [online]. Available: http://www.informedia.cs.cmu.edu/
Multimedia Document Retrieval project at Cambridge University [online]. Available: http://mi.eng.cam.ac.uk/research/Projects/MultimediaDocumentRetrieval/
Miller, D.R.H., Leek, T., Schwartz, R.: Speech and language technologies for audio indexing and retrieval. Proc. IEEE 88(8), 1338–1353 (2000)
Whittaker, S., Hirschberg, J., Choi, J., Hindle, D., Pereira, F., Singhal, A.: Scan: Designing and evaluating user interface to support retrieval from speech archives. In: Proc. ACM SIGIR Conf. R&D in Information Retrieval, pp. 26–33 (1999)
Merlino, A., Maybury, M.: An empirical study of the optimal presentation of multimedia summaries of broadcast news. In: Mani, I., Maybury, M. (eds.) Automated Text Summarization, pp. 391–401. MIT Press, Cambridge (1999)
SpeechBot Audio/Video Search at Hewlett-Packard (HP) Labs [online]. Available: http://www.speechbot.com/
Furui, S.: Recent advances in spontaneous speech recognition and understanding. In: Proc. ISCA & IEEE Workshop on Spontaneous Speech Processing and Recognition, pp. 1–6 (2003)
Columbia Newsblaster project at Columbia University [online]. Available: http://www1.cs.columbia.edu/nlp/newsblaster/
Hofmann, T.: Probabilistic latent semantic analysis. In: Proc. Uncertainty in Artificial Intelligence (1999)
Jin, R., Hauptmann, A.: Automatic title generation for spoken broadcast news. In: Proc. of HLT, pp. 1–3 (2001)
Banko, M., Mittal, V., Witbrock, M.: Headline generation based on statistical translation. In: Proc. of ACL, pp. 318–325 (2000)
Dorr, B., Zajic, D., Schwartz, R.: Hedge trimmer: A parse-and-trim approach to headline generation. In: Proc. of HLT-NAACL, vol. 5, pp. 1–8 (2003)
Furui, S., Kikuchi, T., Shinnaka, Y., Hori, C.: Speech-to-text and speech-to-speech summarization of spontaneous speech. IEEE Trans. on Speech and Audio Processing 12(4), 401–408 (2004)
Hirohata, M., Shinnaka, Y., Iwano, K., Furui, S.: Sentence extraction-based presentation summarization techniques and evaluation metrics. In: Proc. ICASSP, pp. SP–P16.14 (2005)
Kong, S.-Y., Lee, L.-S.: Improved spoken document summarization using probabilistic latent semantic analysis (PLSA). In: Proc. ICASSP (2006) (to appear)
Wang, C.-C.: Improved automatic generation of titles for spoken documents using various scoring techniques. M.S. thesis, National Taiwan University (2006)
Chen, S.-C., Lee, L.-S.: Automatic title generation for Chinese spoken documents using an adaptive k-nearest-neighbor approach. In: Proc. European Conf. Speech Communication and Technology, pp. 2813–2816 (2003)
Li, T.-H., Lee, M.-H., Chen, B., Lee, L.-S.: Hierarchical topic organization and visual presentation of spoken documents using probabilistic latent semantic analysis (PLSA) for efficient retrieval/browsing applications. In: Proc. European Conf. Speech Communication and Technology, pp. 625–628 (2005)
Pan, Y.-C., Wang, C.-C., Hsieh, Y.-C., Lee, T.-H., Lee, Y.-S., Fu, Y.-S., Huang, Y.-T., Lee, L.-S.: A multi-modal dialogue system for information navigation and retrieval across spoken document archives with topic hierarchies. In: Proc. of ASRU, pp. 375–380 (2005)
Chuang, S.-L., Chien, L.-F.: A practical web-based approach to generating topic hierarchy for text segments. In: ACM SIGIR, pp. 127–136 (2004)
Lin, C.-Y.: ROUGE: A package for automatic evaluation of summaries. In: Proc. of Workshop on Text Summarization Branches Out, pp. 74–81 (2004)
Kohonen, T., Kaski, S., Lagus, K., Salojärvi, J., Honkela, J., Paatero, V., Saarela, A.: Self organization of a massive document collection. IEEE Trans. on Neural Networks 11(3), 574–585 (2000)
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lee, Ls., Kong, Sy., Pan, Yc., Fu, Ys., Huang, Yt., Wang, CC. (2006). A Multi-layered Summarization System for Multi-media Archives by Understanding and Structuring of Chinese Spoken Documents. In: Huo, Q., Ma, B., Chng, ES., Li, H. (eds) Chinese Spoken Language Processing. ISCSLP 2006. Lecture Notes in Computer Science, vol 4274. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11939993_69
DOI: https://doi.org/10.1007/11939993_69
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49665-6
Online ISBN: 978-3-540-49666-3
eBook Packages: Computer Science, Computer Science (R0)