Abstract
Interest has surged in recent years in methods for automatically generating content indices for multimedia documents, particularly video and audio documents. This has in turn drawn attention to methods for analyzing transcripts of audio and video broadcasts and of telephone conversations and messages. This paper presents a clustering technique that partitions a set of transcribed documents into meaningful topics. Our method determines the intersection between matching transcripts, evaluates the information contributed by each transcript, assesses the information closeness of the overlapping words, and calculates similarity using the chi-square method. The main novelty of the method lies in the proposed similarity measure, which is designed to withstand the imperfections of transcribed documents. Experimental results on documents of varying transcription quality demonstrate the efficacy of the proposed methodology.
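The abstract does not give the formula for the similarity measure, so the following is only a rough illustration of the general idea of comparing two transcripts over their shared vocabulary with a chi-square statistic. It is not the authors' measure: the function names (`tokenize`, `chi_square_on_shared_words`), the whitespace tokenizer, and the choice of a plain 2 x k contingency-table statistic are all assumptions made for this sketch, and the information-contribution and information-closeness steps are not reproduced.

```python
from collections import Counter


def tokenize(text):
    """Hypothetical tokenizer: lowercase, split on whitespace, keep alphabetic tokens."""
    return [w for w in text.lower().split() if w.isalpha()]


def chi_square_on_shared_words(counts_a, counts_b):
    """Chi-square statistic of a 2 x k table built from words both transcripts contain.

    Returns (statistic, shared_words). A statistic near 0 means the two
    transcripts use their shared words in similar proportions. Returns
    (None, []) when the transcripts share no words.
    This is an illustrative stand-in, not the measure proposed in the paper.
    """
    shared = sorted(set(counts_a) & set(counts_b))
    if not shared:
        return None, shared

    row_a = [counts_a[w] for w in shared]
    row_b = [counts_b[w] for w in shared]
    total_a, total_b = sum(row_a), sum(row_b)
    grand_total = total_a + total_b

    statistic = 0.0
    for a, b in zip(row_a, row_b):
        column_total = a + b
        # Expected counts if both transcripts drew words from one distribution.
        expected_a = total_a * column_total / grand_total
        expected_b = total_b * column_total / grand_total
        statistic += (a - expected_a) ** 2 / expected_a
        statistic += (b - expected_b) ** 2 / expected_b
    return statistic, shared


if __name__ == "__main__":
    doc_a = Counter(tokenize("tax tax tax vote budget"))
    doc_b = Counter(tokenize("tax vote vote vote weather"))
    stat, shared = chi_square_on_shared_words(doc_a, doc_b)
    print(shared, stat)
```

Restricting the table to shared words means that a word misrecognized in one transcript simply drops out of the comparison instead of counting as a mismatch, which is one plausible way to tolerate transcription errors. Whether the paper handles such words this way is not stated in the abstract.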
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ibrahimov, O., Sethi, I., Dimitrova, N. (2002). Clustering of Imperfect Transcripts Using a Novel Similarity Measure. In: Coden, A.R., Brown, E.W., Srinivasan, S. (eds) Information Retrieval Techniques for Speech Applications. IRTSA 2001. Lecture Notes in Computer Science, vol 2273. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45637-6_3
DOI: https://doi.org/10.1007/3-540-45637-6_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43156-5
Online ISBN: 978-3-540-45637-7
eBook Packages: Springer Book Archive