Interactive video summarization with human intentions

Liu, Huaping; Sun, Fuchun; Zhang, Xinyu; Fang, Bin

doi:10.1007/s11042-018-6305-x

Published: 30 June 2018

Volume 78, pages 1737–1755 (2019)

Multimedia Tools and Applications

Huaping Liu ORCID: orcid.org/0000-0002-4042-6044¹,
Fuchun Sun¹,
Xinyu Zhang² &
Bin Fang¹

397 Accesses
2 Citations

Abstract

Automatic video summarization is a typical cognitive-inspired task that attempts to select a small set of the most representative images or video clips from a video sequence, and it is vital for enabling many tasks. In this work, we develop an interactive Non-negative Matrix Factorization (NMF) method for representative action video discovery. The original video is first evenly segmented into short clips, and each clip is described with a bag-of-words model. A temporally consistent NMF model is then used for clustering and action segmentation. Because the clustering and segmentation results may not match the user's intention, two user-controlled operations, MERGE and ADD, allow the user to adjust the results in line with expectations. The interactive NMF method can therefore generate personalized results. Experimental results on the public Weizmann dataset demonstrate that our approach provides satisfactory action discovery and segmentation results.
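The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the temporally consistent NMF model, so the code below assumes a graph-regularized NMF with multiplicative updates, where the graph simply links each clip to its temporal neighbours. The function names (`temporal_nmf`, `segments`, `merge_clusters`), the weight `lam`, and the toy input are all hypothetical and chosen for illustration only. MERGE is shown as a plain relabeling of clusters, which may differ from the operation defined in the article, and ADD is not sketched.

```python
import numpy as np

def temporal_nmf(X, k, lam=1.0, iters=300, seed=0, eps=1e-9):
    """Factor non-negative X (words x clips) as U @ V.T.

    lam weights a smoothness penalty that pulls the codes of
    neighbouring clips together (an assumed form, not the article's).
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, k)) + eps   # basis: one column per action pattern
    V = rng.random((n, k)) + eps   # codes: one row per clip
    # Chain graph over time: clip t is linked to clips t-1 and t+1.
    W = np.zeros((n, n))
    t = np.arange(n - 1)
    W[t, t + 1] = W[t + 1, t] = 1.0
    D = np.diag(W.sum(axis=1))
    for _ in range(iters):
        # Multiplicative updates keep U and V non-negative.
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U + lam * (W @ V)) / (V @ (U.T @ U) + lam * (D @ V) + eps)
    return U, V

def segments(labels):
    """Collapse per-clip labels into (label, start, end) runs, end inclusive."""
    runs, start = [], 0
    for t in range(1, len(labels) + 1):
        if t == len(labels) or labels[t] != labels[start]:
            runs.append((int(labels[start]), start, t - 1))
            start = t
    return runs

def merge_clusters(labels, keep, drop):
    """Hypothetical MERGE: relabel every clip in cluster `drop` as `keep`."""
    labels = np.asarray(labels).copy()
    labels[labels == drop] = keep
    return labels

# Toy input: 10 visual words, 20 clips, two synthetic "actions".
X = np.zeros((10, 20))
X[:5, :10] = 1.0    # first action uses words 0-4
X[5:, 10:] = 1.0    # second action uses words 5-9
U, V = temporal_nmf(X, k=2)
labels = V.argmax(axis=1)   # cluster of each clip = strongest component
print(segments(labels))
```

Each row of `V` holds the cluster weights of one clip, so taking the largest entry per row gives a clustering, and runs of equal labels give an action segmentation. A user who judges two discovered clusters to be the same action could then call `merge_clusters` on the label sequence and recompute the segments.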

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grants U1613212 and 61673238, in part by the Beijing Municipal Science and Technology Commission under Grant D171100005017002, and in part by the National High Technology Research and Development Program of China under Grant 2016YFB0100903.

Author information

Authors and Affiliations

Department of Computer Science and Technology, Tsinghua University, BNRist, State Key Lab. of Intelligent Technology and Systems, Beijing, China
Huaping Liu, Fuchun Sun & Bin Fang
State Key Laboratory of Automotive Safety and Energy, Tsinghua University, Beijing, China
Xinyu Zhang

Corresponding author

Correspondence to Huaping Liu.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Liu, H., Sun, F., Zhang, X. et al. Interactive video summarization with human intentions. Multimed Tools Appl 78, 1737–1755 (2019). https://doi.org/10.1007/s11042-018-6305-x

Received: 04 March 2018
Revised: 07 June 2018
Accepted: 22 June 2018
Published: 30 June 2018
Issue Date: January 2019
DOI: https://doi.org/10.1007/s11042-018-6305-x
