Content Based Identification of Talk Show Videos Using Audio Visual Features

  • Conference paper
Machine Learning and Data Mining in Pattern Recognition (MLDM 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9729)

Abstract

TV talk shows are a forum in which participants exchange views on a particular topic. Large audiences across all age groups follow talk shows worldwide to learn about current affairs and other topics of interest, and a major portion of these audiences use online video databases to find the talk shows they want. Online video databases contain videos of many genres, such as movies, drama, talk shows, animation, sports, horror and music. Because search in these databases is still text based, finding a particular talk show among many other video files is tedious, especially when the uploader has not followed a proper naming convention when captioning the file. Different content-based classification techniques have been proposed in the literature to index video content on the Internet efficiently. The literature also includes a few attempts at identifying whether a video clip is a talk show; however, these approaches use a long list of features, which makes processing slow. This paper proposes a solution that uses audio-visual features and the basic grammar of talk show recording to automatically identify whether a video recently uploaded to a video database contains a talk show. We performed experiments on videos of different genres collected from YouTube and Dailymotion and on movies from Bollywood and Hollywood. The proposed system classifies a video file into the TalkShow and OtherVideo classes with precision and recall of 93% and 100%, respectively.
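For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how precision and recall are computed for a two-class problem such as TalkShow versus OtherVideo. The function name and the counts are hypothetical: the counts were chosen only because they yield the same percentages as the abstract, and they are not the paper's experimental data.

```python
from typing import Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (precision, recall) for the positive class (here: TalkShow).

    tp: talk shows correctly labelled TalkShow
    fp: other videos wrongly labelled TalkShow
    fn: talk shows wrongly labelled OtherVideo
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts, NOT taken from the paper: 93 talk shows found,
# 7 other videos mistaken for talk shows, no talk show missed.
p, r = precision_recall(tp=93, fp=7, fn=0)
print(f"precision={p:.2f} recall={r:.2f}")
```

Under these assumed counts, a recall of 100% means that no talk show is missed, while a precision of 93% means that some non-talk-show videos are still labelled TalkShow.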

Author information

Corresponding author

Correspondence to Atta Muhammad.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Muhammad, A., Daudpota, S.M. (2016). Content Based Identification of Talk Show Videos Using Audio Visual Features. In: Perner, P. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2016. Lecture Notes in Computer Science, vol 9729. Springer, Cham. https://doi.org/10.1007/978-3-319-41920-6_20

  • DOI: https://doi.org/10.1007/978-3-319-41920-6_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-41919-0

  • Online ISBN: 978-3-319-41920-6

  • eBook Packages: Computer Science, Computer Science (R0)
