Abstract
With the ever-increasing volume of video content, efficient and effective video summarization (VS) techniques are urgently needed to manage large amounts of video data. Recent sparse-representation-based approaches have demonstrated promising results for VS. However, these approaches treat each frame independently, so their performance can be strongly affected by individual frames. In this paper, we formulate the VS problem with a collaborative representation model that takes the visual similarity of adjacent frames into consideration. Specifically, during reconstruction, each frame and its adjacent frames are reconstructed collaboratively, so the impact of any individual frame is weakened. In addition, a greedy iterative algorithm is designed for model optimization, in which the sparsity and the average percentage of reconstruction (APOR) are adopted to control the iteration. Experimental results on two benchmark datasets with various types of videos demonstrate that the proposed method not only outperforms the state of the art, but also improves robustness to transitional frames and “outlier” frames.
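As a rough illustration of the idea described in the abstract, the following Python sketch selects keyframes greedily while reconstructing each frame together with its neighbours. It is not the authors' formulation: the abstract does not give the model or the APOR formula, so the neighbour stacking in `stack_neighbors`, the energy-ratio definition in `apor`, and the parameters `radius`, `max_keyframes`, and `apor_target` are all assumptions made for this example.

```python
import numpy as np

def stack_neighbors(X, radius):
    """Concatenate each frame's features with its neighbours' (edges clamped).

    Assumption: this stacking stands in for "collaborative" reconstruction.
    X has shape (d, n), one column per frame.
    """
    n = X.shape[1]
    cols = np.arange(n)
    blocks = [X[:, np.clip(cols + off, 0, n - 1)]
              for off in range(-radius, radius + 1)]
    return np.vstack(blocks)  # shape ((2 * radius + 1) * d, n)

def apor(Y, idx):
    """Assumed APOR: mean fraction of each column's energy that the
    columns listed in idx explain under least-squares reconstruction."""
    D = Y[:, idx]
    coef, *_ = np.linalg.lstsq(D, Y, rcond=None)
    resid = np.sum((Y - D @ coef) ** 2, axis=0)
    energy = np.sum(Y ** 2, axis=0)
    energy = np.where(energy > 0, energy, 1.0)  # guard all-zero frames
    return float(np.mean(1.0 - resid / energy))

def greedy_keyframes(X, radius=1, max_keyframes=5, apor_target=0.9):
    """Add the frame that raises APOR most until the sparsity budget
    (max_keyframes) or the APOR target is reached."""
    Y = stack_neighbors(np.asarray(X, dtype=float), radius)
    n = Y.shape[1]
    selected, trace = [], []
    while len(selected) < min(max_keyframes, n):
        best_j, best_score = None, -np.inf
        for j in range(n):
            if j in selected:
                continue
            score = apor(Y, selected + [j])
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
        trace.append(best_score)
        if best_score >= apor_target:
            break
    return selected, trace
```

In this sketch, a single transitional or outlier frame changes only part of its stacked vector, so it is less likely to be picked as a keyframe than under per-frame reconstruction. The exhaustive candidate search is quadratic in the number of frames and is meant only to make the two stopping criteria concrete.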
Acknowledgments
This work is partially supported by the National Natural Science Foundation of China (61671383) and the Fundamental Research Funds for the Central Universities (3102018AX001).
Cite this article
Ma, M., Mei, S., Wan, S. et al. Robust video summarization using collaborative representation of adjacent frames. Multimed Tools Appl 78, 28985–29005 (2019). https://doi.org/10.1007/s11042-018-6053-y
DOI: https://doi.org/10.1007/s11042-018-6053-y