Content Based Identification of Talk Show Videos Using Audio Visual Features

Muhammad, Atta; Daudpota, Sher Muhammad

doi:10.1007/978-3-319-41920-6_20


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9729)

Included in the following conference series:

International Conference on Machine Learning and Data Mining in Pattern Recognition

Abstract

TV talk shows let participants exchange views on a particular topic. Large audiences across all age groups follow talk shows worldwide to learn about current affairs and other topics of interest, and a major portion of these audiences use online video databases to find them. Online video databases contain videos of many genres, such as movies, drama, talk shows, animation, sports, horror, and music. Finding a particular talk show among so many other video files is tedious, especially when the uploader has not followed a proper naming convention in captioning the file, because search in video databases is still text based. Various content-based classification techniques have been proposed in the literature to index video content on the Internet efficiently. The literature also includes a few attempts at identifying whether a video clip represents a talk show; however, these approaches use a long list of features, which makes processing slow. This paper proposes a solution that uses audio-visual features and the basic grammar of talk show recording to automatically identify whether a video recently uploaded to a video database contains a talk show. We performed experiments on videos of different genres collected from YouTube and Dailymotion, and on movies from Bollywood and Hollywood. The proposed system classifies a video file into the TalkShow and OtherVideo classes with precision and recall of 93% and 100%, respectively.
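
The abstract reports results as precision and recall for the TalkShow class. As a reminder of what those two figures measure, the following is a minimal sketch using the standard definitions (precision = TP / (TP + FP), recall = TP / (TP + FN)). It is not the authors' evaluation code: the function name `precision_recall` and the toy label lists are hypothetical and are not data from the paper.

```python
from typing import Sequence, Tuple


def precision_recall(
    y_true: Sequence[str],
    y_pred: Sequence[str],
    positive: str = "TalkShow",
) -> Tuple[float, float]:
    """Return (precision, recall) for the `positive` class."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    pairs = list(zip(y_true, y_pred))
    # True positives: talk shows that were labelled as talk shows.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: other videos wrongly labelled as talk shows.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: talk shows the classifier missed.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy labels for illustration only, not results from the paper.
truth = ["TalkShow", "TalkShow", "OtherVideo", "OtherVideo", "TalkShow"]
predicted = ["TalkShow", "TalkShow", "TalkShow", "OtherVideo", "TalkShow"]
precision, recall = precision_recall(truth, predicted)
```

Read this way, a recall of 100% means no talk show in the test set was missed, while a precision below 100% means some videos of other genres were labelled as talk shows.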

Author information

Authors and Affiliations

Department of Computer Science, Sukkur IBA, Sukkur, Sindh, Pakistan
Atta Muhammad & Sher Muhammad Daudpota

Corresponding author

Correspondence to Atta Muhammad.

Editor information

Editors and Affiliations

IBaI, Inst of Comp Vision and applied Comp Sci, Leipzig, Sachsen, Germany
Petra Perner

About this paper

Cite this paper

Muhammad, A., Daudpota, S.M. (2016). Content Based Identification of Talk Show Videos Using Audio Visual Features. In: Perner, P. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2016. Lecture Notes in Computer Science, vol 9729. Springer, Cham. https://doi.org/10.1007/978-3-319-41920-6_20

DOI: https://doi.org/10.1007/978-3-319-41920-6_20
Published: 28 June 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41919-0
Online ISBN: 978-3-319-41920-6
eBook Packages: Computer Science, Computer Science (R0)
