Multimodal detection of highlights for multimedia content

Dagtas, Serhan; Abdel-Mottaleb, Mohamed

doi:10.1007/s00530-003-0130-3

Published: June 2004

Volume 9, pages 586–593 (2004)

Multimedia Systems

Abstract.

We present a multimedia information analysis framework for content-based browsing of video. Specifically, we develop algorithms for the automated extraction of video highlights in sports video that are based on audio, text, and image features. The extracted annotations are used to build applications for selective browsing of sports videos. Such summarization techniques enable content-based indexing of multimedia documents for efficient storage and retrieval. In addition, in the context of the newly emerging standard MPEG-7, these methods will enable applications that use MPEG-7 descriptions. As this standard provides only the syntax for representing such descriptions and not specific algorithms for extracting them, these algorithms are of great value for establishing MPEG-7 as an accepted standard. We provide experimental results for the proposed algorithms on several hours of sports programs that prove the feasibility of efficient video access techniques in a multimedia environment.

Author information

Authors and Affiliations

Serhan Dagtas: Department of Information Science, University of Arkansas, 2800 S. University Ave., Little Rock, AR 72204, USA
Mohamed Abdel-Mottaleb: Department of Electrical & Computer Engineering, University of Miami, 1251 Memorial Drive, Coral Gables, FL 33146, USA

Corresponding author

Correspondence to Serhan Dagtas.

About this article

Cite this article

Dagtas, S., Abdel-Mottaleb, M. Multimodal detection of highlights for multimedia content. Multimedia Systems 9, 586–593 (2004). https://doi.org/10.1007/s00530-003-0130-3

Issue Date: June 2004
DOI: https://doi.org/10.1007/s00530-003-0130-3
