Abstract
Key frame extraction is an important means of video summarization that allows video content to be interpreted quickly. Existing approaches first partition the entire video into clips by shot boundary detection and then extract key frames by frame clustering. However, in most team-sport videos a single clip usually contains many events, and it is difficult to accurately extract key frames for all of them, because different events within a game shot can look similar. Most events in team-sport videos are attack and defense conversions, which are associated with global translation. A fine-grained partition based on global motion can therefore split a shot into more clips, from which more event-related key frames can be extracted. In this study, global horizontal motion is introduced to further partition video clips into fine-grained clips. Global motion statistics are then used to extract candidate key frames. Finally, representative key frames are extracted based on spatial–temporal consistency and hierarchical clustering, and redundant frames are removed. A dataset called SportKF is built, which comprises 25 videos (197,878 frames, 112 min) with 764 key frames from four types of sports (basketball, football, American football and field hockey). The experimental results demonstrate that the proposed scheme achieves state-of-the-art performance by introducing global motion statistics.
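The fine-grained partition idea can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a hypothetical per-frame signal `motion` (for example, the median horizontal optical-flow component of each frame, in pixels), cuts the shot wherever that signal reverses direction, and then picks one candidate frame per clip using an assumed heuristic (the frame with the smallest absolute global motion). The names `partition_by_horizontal_motion`, `candidate_key_frames`, `dead_zone` and `min_len` are illustrative only.

```python
from typing import List, Tuple


def partition_by_horizontal_motion(
    motion: List[float], dead_zone: float = 0.5, min_len: int = 3
) -> List[Tuple[int, int]]:
    """Split a shot into clips where the global horizontal motion reverses.

    motion[i] is an assumed per-frame global horizontal displacement.
    Returns half-open (start, end) frame index ranges.
    """
    if not motion:
        return []
    clips = []
    start = 0
    direction = 0  # -1 = leftward, +1 = rightward, 0 = not yet known
    for i, m in enumerate(motion):
        if abs(m) <= dead_zone:
            continue  # small motion is treated as noise and never triggers a cut
        sign = 1 if m > 0 else -1
        # Cut only on a direction reversal, and only if the clip is long enough.
        if direction != 0 and sign != direction and i - start >= min_len:
            clips.append((start, i))
            start = i
        direction = sign
    clips.append((start, len(motion)))
    return clips


def candidate_key_frames(
    motion: List[float], clips: List[Tuple[int, int]]
) -> List[int]:
    """Pick one candidate per clip: the frame with the least absolute motion.

    This selection rule is an assumption made for illustration.
    """
    return [min(range(s, e), key=lambda i: abs(motion[i])) for s, e in clips]


# Toy signal: rightward pan, leftward pan, then rightward again.
motion = [2.0, 2.5, 3.0, 0.2, -2.0, -3.0, -2.5, 0.1, 1.5, 2.0]
clips = partition_by_horizontal_motion(motion)
print(clips)                                # [(0, 4), (4, 8), (8, 10)]
print(candidate_key_frames(motion, clips))  # [3, 7, 8]
```

In a full pipeline the candidates would still need the clustering and redundancy-removal stage described in the abstract; the sketch covers only the partition and candidate-selection steps.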
Notes
Youtube-8M: https://research.google.com/youtube8m/.
Acknowledgements
This study was partially supported by the National Natural Science Foundation of China (61976010, 61802011, 61702022), the National Key R&D Program of China (2019YFF0301802), the Beijing Municipal Education Committee Science Foundation (KM201910005024), the Postdoctoral Research Foundation of China (2018M640033), and the "Ri Xin" Training Programme Foundation for Talents by Beijing University of Technology. We would also like to thank Editage (www.editage.cn) for English language editing.
About this article
Cite this article
Yuan, Y., Lu, Z., Yang, Z. et al. Key frame extraction based on global motion statistics for team-sport videos. Multimedia Systems 28, 387–401 (2022). https://doi.org/10.1007/s00530-021-00777-7
DOI: https://doi.org/10.1007/s00530-021-00777-7