
Key frame extraction scheme based on sliding window and features

Peer-to-Peer Networking and Applications

Abstract

With the rapid development of the Internet and P2P technology, multimedia resources are growing steadily and are widely used. As network traffic increases sharply, helping the large number of Internet users find the information they are interested in is challenging. Technologies and applications such as video search, fast video browsing, and video indexing and storage are therefore in great demand. An important problem underlying these technologies and applications is how to browse massive video data quickly and obtain the main content of a video. To solve this problem, different key frame extraction algorithms have been proposed. Because video content is diverse, different videos have different characteristics, so designing a general key frame extraction algorithm that works for all videos is not realistic. The main trend is to design key frame extraction algorithms based on the characteristics of the video itself. In this article, we focus on videos with edited boundaries and shot transitions. For this kind of video, we have designed and implemented a key frame extraction algorithm based on a sliding window, the global feature Gist, and the local feature point detection algorithm SURF. In this algorithm, we use the Gist feature to capture the global scene information of each frame and the SURF key point detection algorithm to extract local key points as the local feature of each frame. Then, sliding-window shot segmentation and a shot merging algorithm are applied to divide the original video into several shots. After that, we select the most representative frames in each shot as key frames. Finally, we evaluate the results of the algorithm from both subjective and objective perspectives. The results show that the key frames extracted by the algorithm are of high quality and basically cover the main content of the original video.
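The abstract does not give the details of the algorithm, so the following is only a minimal sketch of the general idea of sliding-window shot segmentation followed by representative-frame selection. It assumes each frame has already been reduced to a fixed-length feature vector (for example a Gist-like descriptor); the function names, the `window` and `ratio` parameters, the local-peak boundary rule and the "closest to the shot mean" selection rule are all illustrative assumptions, not the authors' method. SURF key points and the shot merging step are omitted.

```python
import numpy as np


def segment_shots(features, window=5, ratio=3.0):
    """Return the start index of each shot (hypothetical boundary rule).

    A boundary is declared where the distance between consecutive frames
    is the largest inside a sliding window and clearly exceeds the mean
    of the other distances in that window.
    """
    features = np.asarray(features, dtype=float)
    # Distance between each pair of consecutive frame descriptors.
    diffs = np.linalg.norm(np.diff(features, axis=0), axis=1)
    boundaries = [0]
    for i, d in enumerate(diffs):
        lo = max(0, i - window)
        hi = min(len(diffs), i + window + 1)
        local = diffs[lo:hi]
        neighbours = np.delete(local, i - lo)
        baseline = neighbours.mean() if neighbours.size else 0.0
        # Assumed rule: local peak that is `ratio` times the local baseline.
        if d > ratio * baseline + 1e-9 and d == local.max():
            boundaries.append(i + 1)
    return boundaries


def select_key_frames(features, boundaries):
    """Pick, for each shot, the frame closest to the shot's mean descriptor."""
    features = np.asarray(features, dtype=float)
    ends = boundaries[1:] + [len(features)]
    keys = []
    for start, end in zip(boundaries, ends):
        shot = features[start:end]
        dists = np.linalg.norm(shot - shot.mean(axis=0), axis=1)
        keys.append(start + int(dists.argmin()))
    return keys
```

In this sketch a frame descriptor array of shape `(num_frames, dim)` is passed to `segment_shots`, and the resulting boundary list is passed to `select_key_frames`. In practice the thresholds would need tuning per video type, which matches the abstract's point that no single general-purpose setting is realistic.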

Acknowledgment

This work is supported by the National Natural Science Foundation of China (No. 61502439).

Author information

Corresponding author

Correspondence to Linchen Yu.

Additional information

This article is part of the Topical Collection: Special Issue on Big Data Networking

Guest Editors: Xiaofei Liao, Song Guo, Deze Zeng, and Kun Wang

About this article

Cite this article

Yu, L., Cao, J., Chen, M. et al. Key frame extraction scheme based on sliding window and features. Peer-to-Peer Netw. Appl. 11, 1141–1152 (2018). https://doi.org/10.1007/s12083-017-0567-3

  • DOI: https://doi.org/10.1007/s12083-017-0567-3
