Shot Boundary Detection Through Multi-stage Deep Convolution Neural Network

  • Conference paper
MultiMedia Modeling (MMM 2021)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12572)

Abstract

Fast and accurate shot segmentation is very important for content-based video analysis. However, existing solutions have not yet achieved an ideal balance of speed and accuracy. In this paper, we propose a multi-stage shot boundary detection framework based on deep CNNs for shot segmentation tasks. The process consists of three stages: candidate boundary detection, abrupt transition detection, and gradual transition detection. At each stage, a deep CNN is used to extract image features, which overcomes the disadvantages of hand-crafted feature-based methods, such as poor scalability and complex calculation. In addition, we set a variety of constraints to filter out as many non-boundaries as possible and thereby improve the processing speed of the model. In gradual transition detection, we introduce a scheme that infers the position of a gradual transition by computing probability signals for its start, middle, and end. We conduct experiments on ClipShots, and the experimental results show that the proposed model achieves better performance on abrupt and gradual transition detection.
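
The abstract does not spell out how the start, middle, and end probability signals are combined, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes three per-frame probability lists are already available (for example, from a CNN), treats a thresholded local maximum of the middle signal as a candidate, and then looks for the most probable start before it and the most probable end after it. The function name, `threshold`, and `max_half_len` are hypothetical.

```python
from typing import List, Tuple


def infer_gradual_transitions(
    p_start: List[float],
    p_mid: List[float],
    p_end: List[float],
    threshold: float = 0.5,  # hypothetical decision threshold
    max_half_len: int = 30,  # hypothetical search radius, in frames
) -> List[Tuple[int, int]]:
    """Return (start_frame, end_frame) pairs inferred from three per-frame signals."""
    n = len(p_mid)
    if len(p_start) != n or len(p_end) != n:
        raise ValueError("all three signals must have the same length")

    transitions: List[Tuple[int, int]] = []
    for t in range(n):
        # Candidate midpoint: a local maximum of the mid signal above the threshold.
        left = p_mid[t - 1] if t > 0 else float("-inf")
        right = p_mid[t + 1] if t < n - 1 else float("-inf")
        if p_mid[t] < threshold or p_mid[t] < left or p_mid[t] <= right:
            continue

        # Most probable start at or before the midpoint, within the search radius.
        lo = max(0, t - max_half_len)
        start = max(range(lo, t + 1), key=lambda i: p_start[i])

        # Most probable end at or after the midpoint, within the search radius.
        hi = min(n - 1, t + max_half_len)
        end = max(range(t, hi + 1), key=lambda i: p_end[i])

        # Accept only if both endpoints are themselves confident and ordered.
        if p_start[start] >= threshold and p_end[end] >= threshold and start < end:
            transitions.append((start, end))
    return transitions


if __name__ == "__main__":
    # Toy signals: start peaks at frame 1, mid at frame 3, end at frame 5.
    p_s = [0.0, 0.9, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0]
    p_m = [0.0, 0.0, 0.2, 0.8, 0.3, 0.0, 0.0, 0.0]
    p_e = [0.0, 0.0, 0.0, 0.0, 0.1, 0.9, 0.0, 0.0]
    print(infer_gradual_transitions(p_s, p_m, p_e))
```

Requiring all three signals to agree is a conservative design choice in this sketch: a strong middle response without confident endpoints is discarded rather than reported with guessed boundaries.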

Author information

Corresponding author

Correspondence to Junqing Yu.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Wang, T., Feng, N., Yu, J., He, Y., Hu, Y., Chen, YP.P. (2021). Shot Boundary Detection Through Multi-stage Deep Convolution Neural Network. In: Lokoč, J., et al. MultiMedia Modeling. MMM 2021. Lecture Notes in Computer Science, vol 12572. Springer, Cham. https://doi.org/10.1007/978-3-030-67832-6_37

  • DOI: https://doi.org/10.1007/978-3-030-67832-6_37

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-67831-9

  • Online ISBN: 978-3-030-67832-6

  • eBook Packages: Computer Science, Computer Science (R0)
