Video highlight extraction via content-aware deep transfer

Abstract

In this paper, we focus on detecting highlights in online videos. Given the explosive growth of online videos, it is becoming increasingly important to single out highlights for audiences instead of requiring them to browse every tedious part of a video. Ideally, the content of the extracted highlights should be consistent with both the topic of the video and the preferences of the individual audience. To this end, this paper introduces a novel content-aware approach that formulates highlight detection within a transfer learning framework. Experimental results on three different types of videos show that our content-aware highlight extraction method is particularly useful for online video content fetching, e.g., presenting an abstraction of the entire video while focusing playback on the parts that match the user's queries.
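
The abstract gives no implementation details, so the following is only a minimal, hypothetical sketch of the kind of transfer-learning pipeline it describes: a model pretrained on a source image domain is reused as a frozen feature extractor, and a small content- and query-aware head is trained to score video segments as highlights. The HighlightScorer class, the ResNet-18 backbone, the toy query vocabulary, and the tensor shapes are all illustrative assumptions (PyTorch with a recent torchvision is assumed), not the authors' actual architecture.

```python
# A minimal, hypothetical sketch (not the authors' implementation) of a
# content-aware highlight scorer built via transfer learning: an ImageNet-
# pretrained ResNet-18 is reused as a frozen feature extractor, and only a
# small head that fuses segment content with a query embedding is trained
# on the target highlight data.

import torch
import torch.nn as nn
from torchvision import models


class HighlightScorer(nn.Module):
    def __init__(self, vocab_size: int = 1000, query_dim: int = 128):
        super().__init__()
        # Source knowledge: ImageNet-pretrained CNN, kept frozen.
        backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
        self.backbone = nn.Sequential(*list(backbone.children())[:-1])  # drop final fc
        for p in self.backbone.parameters():
            p.requires_grad = False

        # Target task: trainable head that fuses frame content with a
        # (toy) query-term embedding and outputs a per-segment highlight score.
        self.query_embed = nn.Embedding(vocab_size, query_dim)
        self.head = nn.Sequential(
            nn.Linear(512 + query_dim, 256),
            nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, frames: torch.Tensor, query_ids: torch.Tensor) -> torch.Tensor:
        # frames: (batch, 3, 224, 224) representative frames of candidate segments
        # query_ids: (batch,) integer ids of the user's query term
        with torch.no_grad():                              # backbone is transferred, not tuned
            feats = self.backbone(frames).flatten(1)       # (batch, 512)
        q = self.query_embed(query_ids)                    # (batch, query_dim)
        score = self.head(torch.cat([feats, q], dim=1))    # (batch, 1)
        return score.squeeze(1)


if __name__ == "__main__":
    model = HighlightScorer()
    frames = torch.randn(4, 3, 224, 224)       # four candidate segments
    query = torch.full((4,), 42)                # the same query term for each
    scores = model(frames, query)               # higher score = more likely highlight
    print(scores.shape)                         # torch.Size([4])
```

Under such a setup, only the small head would need target-domain highlight labels (e.g., trained with a ranking or binary cross-entropy loss over candidate segments), which is the usual appeal of transferring a pretrained backbone when labeled highlight data are scarce.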

Acknowledgments

The research was supported in part by the Natural Science Foundation of China (NSFC) under Grant No. 61703046.

Author information

Corresponding author

Correspondence to Han Wang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Niu, K., Wang, H. Video highlight extraction via content-aware deep transfer. Multimed Tools Appl 78, 21133–21144 (2019). https://doi.org/10.1007/s11042-019-7442-6


