Robust video summarization using collaborative representation of adjacent frames

Ma, Mingyang; Mei, Shaohui; Wan, Shuai; Wang, Zhiyong; Feng, David Dagan

doi:10.1007/s11042-018-6053-y

Published: 01 June 2018

Volume 78, pages 28985–29005 (2019)

Multimedia Tools and Applications

Mingyang Ma¹,
Shaohui Mei¹ (ORCID: orcid.org/0000-0002-8018-596X),
Shuai Wan¹,
Zhiyong Wang² &
David Dagan Feng²

Abstract

With the ever-increasing volume of video content, efficient and effective video summarization (VS) techniques are urgently needed to manage large amounts of video data. Recent approaches based on sparse representation have demonstrated promising results for VS. However, these approaches treat each frame independently, so their performance can be strongly influenced by any individual frame. In this paper, we formulate the VS problem with a collaborative representation model that takes the visual similarity of adjacent frames into consideration. Specifically, during reconstruction, each frame and its adjacent frames are reconstructed collaboratively, so the impact of an individual frame is weakened. In addition, a greedy iterative algorithm is designed for model optimization, in which the sparsity and the average percentage of reconstruction (APOR) are adopted to control the iteration. Experimental results on two benchmark datasets with various types of videos demonstrate that the proposed method not only outperforms the state of the art, but also improves robustness to transitional frames and “outlier” frames.
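
The abstract does not give the objective function or the exact definition of APOR, so the following Python sketch is only an illustration of the general idea, not the authors' algorithm. It assumes that each frame is represented by a feature column, that a frame and its neighbours within a hypothetical `radius` share one least-squares coefficient vector over the selected keyframes, and that the percentage of reconstruction is one minus the ratio of residual energy to signal energy over that window. The names `window_por`, `greedy_summary`, `radius`, `max_keyframes`, and `apor_target` are all made up for this example.

```python
import numpy as np


def window_por(X, D, radius):
    """Percentage of reconstruction per frame (assumed definition).

    X: (d, n) feature matrix, one column per frame.
    D: (d, k) matrix whose columns are the currently selected keyframes.
    Each frame and its neighbours within `radius` are reconstructed with
    one shared coefficient vector, so a single odd frame has less influence.
    """
    n = X.shape[1]
    por = np.empty(n)
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        block = X[:, lo:hi]
        # The shared least-squares coefficients for the whole window are
        # the least-squares fit to the window mean.
        coef, *_ = np.linalg.lstsq(D, block.mean(axis=1), rcond=None)
        resid = block - (D @ coef)[:, None]
        por[i] = 1.0 - np.sum(resid ** 2) / max(np.sum(block ** 2), 1e-12)
    return por


def greedy_summary(X, radius=1, max_keyframes=5, apor_target=0.9):
    """Greedily add the frame that raises the average POR the most.

    Stops when the keyframe budget (sparsity) is reached or the average
    POR reaches `apor_target`.
    """
    n = X.shape[1]
    selected, apor = [], 0.0
    while len(selected) < max_keyframes and apor < apor_target:
        best_c, best_apor = None, -np.inf
        for c in range(n):
            if c in selected:
                continue
            score = window_por(X, X[:, selected + [c]], radius).mean()
            if score > best_apor:
                best_c, best_apor = c, score
        if best_c is None:
            break
        selected.append(best_c)
        apor = best_apor
    return sorted(selected), apor


# Toy usage: three synthetic "scenes" of ten near-identical frames each.
rng = np.random.default_rng(0)
scenes = np.repeat(np.eye(8)[:, :3], 10, axis=1)
frames = scenes + 0.01 * rng.standard_normal(scenes.shape)
keyframes, apor = greedy_summary(frames)
print(keyframes, round(apor, 3))
```

The exhaustive candidate search keeps the sketch short but costs one least-squares solve per frame per candidate, which is only practical for toy inputs.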

Acknowledgments

This work is partially supported by the National Natural Science Foundation of China (61671383) and the Fundamental Research Funds for the Central Universities (3102018AX001).

Author information

Authors and Affiliations

Northwestern Polytechnical University, Xi’an, China
Mingyang Ma, Shaohui Mei & Shuai Wan
The University of Sydney, Sydney, Australia
Zhiyong Wang & David Dagan Feng

Corresponding author

Correspondence to Shaohui Mei.

About this article

Cite this article

Ma, M., Mei, S., Wan, S. et al. Robust video summarization using collaborative representation of adjacent frames. Multimed Tools Appl 78, 28985–29005 (2019). https://doi.org/10.1007/s11042-018-6053-y

Received: 08 January 2018
Revised: 02 April 2018
Accepted: 23 April 2018
Published: 01 June 2018
Issue Date: October 2019
DOI: https://doi.org/10.1007/s11042-018-6053-y
