Video genre identification using clustering-based shot detection algorithm

  • Original Paper
  • Signal, Image and Video Processing

Abstract

Rapid growth in storage technology and data acquisition has significantly increased the volume of multimedia data online. Analyzing this massive quantity of multimedia data is a challenging problem. In recent years, content-based indexing of video files has gained popularity in the research community. There have also been attempts to identify whether a video clip belongs to a specific genre, e.g., a sports video, movie, drama, animation or talk show. These techniques, however, use a long list of audio-visual features to achieve this classification task, which decreases processing efficiency. Based on certain patterns in audio-visual features and the basic grammar of talk shows, this research differentiates the multimedia content of talk shows from the rest of the video genres. Our multimodal rule-based classification approach exploits shots and scenes in a video as classification features. Content from popular multimedia databases such as DailyMotion and YouTube, together with Hollywood and Bollywood movies, is used as the dataset to test the overall genre identification system. The system achieves precision and recall of 98% and 100%, respectively, on 600 selected videos totaling more than 600 h in duration, classifying multimedia content into either the ‘TalkShow’ or ‘OtherVideo’ category.
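To make the reported metrics concrete, the following minimal sketch shows how precision and recall are computed for the positive ‘TalkShow’ class in a binary ‘TalkShow’ versus ‘OtherVideo’ setting. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from the paper's experiments.

```python
from typing import Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (precision, recall) for the positive class.

    tp: videos correctly labelled 'TalkShow'
    fp: 'OtherVideo' items wrongly labelled 'TalkShow'
    fn: 'TalkShow' items wrongly labelled 'OtherVideo'
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts for illustration only (NOT the paper's data).
tp, fp, fn = 98, 2, 0
precision, recall = precision_recall(tp, fp, fn)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

A recall of 100% means that no talk show was missed (no false negatives), while a precision of 98% means that a small share of the videos labelled ‘TalkShow’ actually belonged to other genres.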

Author information

Corresponding author

Correspondence to Sher Muhammad Daudpota.

About this article

Cite this article

Daudpota, S.M., Muhammad, A. & Baber, J. Video genre identification using clustering-based shot detection algorithm. SIViP 13, 1413–1420 (2019). https://doi.org/10.1007/s11760-019-01488-3

  • DOI: https://doi.org/10.1007/s11760-019-01488-3
