Abstract
With the rapid development of the Internet and P2P technology, multimedia resources are growing steadily and are widely used. As network traffic increases sharply, helping large numbers of Internet users find the information that interests them becomes challenging. Technologies and applications such as video search, fast video browsing, and video indexing and storage are therefore in great demand. An important problem underlying these technologies and applications is how to browse massive video data quickly and obtain the main content of a video. To solve this problem, various key frame extraction algorithms have been proposed. Because video content is diverse, different videos have different characteristics, so designing a single general-purpose key frame extraction algorithm is not realistic. The main trend is instead to design key frame extraction algorithms around the characteristics of the video itself. In this article, we focus on videos with edited boundaries and shot transitions. For this kind of video, we design and implement a key frame extraction algorithm based on a sliding window, the global feature Gist, and the local feature point detection algorithm SURF. In this algorithm, we use the Gist feature to capture the global scene information of each frame, and the SURF key point detection algorithm to extract local key points as the local feature of each frame. Then, sliding-window shot segmentation and a shot merging algorithm are applied to divide the original video into several shots. After that, we select the most representative frames in each shot as key frames. Finally, we evaluate the results of the algorithm from both subjective and objective perspectives. The results show that the extracted key frames are of high quality and largely cover the main content of the original video.
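The pipeline outlined in the abstract (per-frame features, sliding-window shot segmentation, shot merging, representative-frame selection) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the distance measure, thresholds, merging rule, or selection criterion, so everything below is an assumption made for illustration. Each frame is assumed to be already reduced to a numeric feature vector (for example a Gist descriptor, possibly combined with SURF-derived statistics); a boundary is declared where the frame-to-frame distance is the maximum in its window and exceeds `ratio` times the mean of its neighbours; shots shorter than `min_len` frames are merged into the preceding shot; and the key frame is the frame closest to the shot's mean feature. The names `detect_boundaries`, `merge_short_shots`, `key_frame`, and `extract_key_frames`, and the parameters `window`, `ratio`, and `min_len`, are hypothetical.

```python
from math import sqrt
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Shot = Tuple[int, int]  # half-open frame range [start, end)


def distance(a: Vector, b: Vector) -> float:
    """Euclidean distance between two feature vectors (an assumed metric)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def detect_boundaries(features: Sequence[Vector], window: int = 3,
                      ratio: float = 3.0) -> List[int]:
    """Return frame indices at which a new shot is assumed to start."""
    # diffs[k] is the change between frame k and frame k + 1.
    diffs = [distance(features[i - 1], features[i])
             for i in range(1, len(features))]
    boundaries = []
    for k, d in enumerate(diffs):
        lo, hi = max(0, k - window), min(len(diffs), k + window + 1)
        neighbours = diffs[lo:k] + diffs[k + 1:hi]
        if not neighbours:
            continue
        local_mean = sum(neighbours) / len(neighbours)
        # Assumed rule: the jump must dominate its sliding window.
        if d > ratio * local_mean and d == max(diffs[lo:hi]):
            boundaries.append(k + 1)
    return boundaries


def to_shots(n_frames: int, boundaries: List[int]) -> List[Shot]:
    """Turn boundary indices into consecutive half-open shot ranges."""
    return list(zip([0] + boundaries, boundaries + [n_frames]))


def merge_short_shots(shots: List[Shot], min_len: int = 3) -> List[Shot]:
    """Assumed merging rule: fold very short shots into the previous shot."""
    merged: List[Shot] = []
    for start, end in shots:
        if merged and end - start < min_len:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


def key_frame(features: Sequence[Vector], shot: Shot) -> int:
    """Pick the frame whose feature is closest to the shot's mean feature."""
    start, end = shot
    dim = len(features[start])
    mean = [sum(features[i][j] for i in range(start, end)) / (end - start)
            for j in range(dim)]
    return min(range(start, end), key=lambda i: distance(features[i], mean))


def extract_key_frames(features: Sequence[Vector], window: int = 3,
                       ratio: float = 3.0, min_len: int = 3) -> List[int]:
    """Run segmentation, merging, and selection; return key frame indices."""
    if not features:
        return []
    shots = to_shots(len(features), detect_boundaries(features, window, ratio))
    return [key_frame(features, s) for s in merge_short_shots(shots, min_len)]


if __name__ == "__main__":
    # Toy 2-D "features": six frames of one scene, then six of another.
    scene_a = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
               [0.1, 0.1], [0.0, 0.0], [0.1, 0.0]]
    scene_b = [[x + 10.0, y + 10.0] for x, y in scene_a]
    print(extract_key_frames(scene_a + scene_b))
```

Comparing each jump against its local neighbourhood rather than a fixed global threshold is one common reason to use a sliding window: the same absolute change can be ordinary in a fast-moving shot and significant in a static one.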
Acknowledgment
This work is supported by the National Natural Science Foundation of China (No. 61502439).
Additional information
This article is part of the Topical Collection: Special Issue on Big Data Networking
Guest Editors: Xiaofei Liao, Song Guo, Deze Zeng, and Kun Wang
About this article
Cite this article
Yu, L., Cao, J., Chen, M. et al. Key frame extraction scheme based on sliding window and features. Peer-to-Peer Netw. Appl. 11, 1141–1152 (2018). https://doi.org/10.1007/s12083-017-0567-3
DOI: https://doi.org/10.1007/s12083-017-0567-3