research-article

MvsGCN: A Novel Graph Convolutional Network for Multi-video Summarization

Authors:

Sheng-Hua Zhong,

Yan Liu

MM '19: Proceedings of the 27th ACM International Conference on Multimedia

Pages 827 - 835

https://doi.org/10.1145/3343031.3350938

Published: 15 October 2019

Abstract

Multi-video summarization, which aims to generate a single summary for a collection of videos, is an important task in dealing with ever-growing video data. In this paper, we are the first to propose a graph convolutional network for multi-video summarization. The novel network measures the importance and relevance of each video shot both within its own video and across the whole video collection. An important node sampling method is proposed to emphasize the effective features that are more likely to be selected for the final video summary. Two strategies are integrated into the network to address the inherent class imbalance problem in video summarization. A diversity regularization term in the loss encourages the generation of a diverse summary. Extensive experiments show that, in comparison with traditional and recent graph models and state-of-the-art video summarization methods, the proposed model is effective in generating a representative summary for multiple videos with good diversity. It also achieves state-of-the-art performance on two standard video summarization datasets.
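
The abstract names the ingredients of the model but gives no equations, so the following is only a minimal sketch of the general ideas, not the authors' MvsGCN. It assumes that video shots are graph nodes, that edges come from a cosine-similarity threshold (a hypothetical rule), that propagation follows the renormalized rule of Kipf and Welling, and that diversity is penalized through score-weighted pairwise similarity. All function names and parameters are illustrative.

```python
import numpy as np


def cosine_similarity_graph(features, threshold=0.5):
    """Build an undirected shot graph (assumed rule: link shots whose
    feature cosine similarity exceeds `threshold`)."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T
    adj = (sim > threshold).astype(float)
    np.fill_diagonal(adj, 0.0)  # self-loops are added during normalization
    return adj


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} (Kipf and Welling renormalization)."""
    a_hat = adj + np.eye(adj.shape[0])
    deg = a_hat.sum(axis=1)  # at least 1 because of the self-loop
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_norm, h, w):
    """One graph convolution: ReLU(A_norm @ H @ W)."""
    return np.maximum(a_norm @ h @ w, 0.0)


def diversity_penalty(features, scores):
    """Score-weighted mean pairwise cosine similarity between distinct shots.
    Lower values mean the highly scored shots are more diverse (assumed form)."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T
    weights = np.outer(scores, scores)
    np.fill_diagonal(weights, 0.0)  # ignore each shot's similarity to itself
    total = weights.sum()
    if total <= 0.0:
        return 0.0
    return float((weights * sim).sum() / total)


# Toy usage: 5 shots with 8-dimensional features and random layer weights.
rng = np.random.default_rng(0)
shot_features = rng.normal(size=(5, 8))
a_norm = normalized_adjacency(cosine_similarity_graph(shot_features))
hidden = gcn_layer(a_norm, shot_features, rng.normal(size=(8, 4)))
shot_scores = 1.0 / (1.0 + np.exp(-hidden.sum(axis=1)))  # importance in (0, 1)
penalty = diversity_penalty(shot_features, shot_scores)
```

In a trained model the layer weights would be learned and the penalty would be added to a classification loss; here both are stand-ins that only show how per-shot scores and a diversity term could be computed on the same graph.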

Published In

MM '19: Proceedings of the 27th ACM International Conference on Multimedia

October 2019

2794 pages

ISBN:9781450368896

DOI:10.1145/3343031

General Chairs:
Laurent Amsaleg (CNRS-IRISA, France)
Benoit Huet (EURECOM, France)
Martha Larson (Radboud University and TU Delft, Netherlands)

Program Chairs:
Guillaume Gravier (CNRS-IRISA, France)
Hayley Hung (Delft University of Technology, Netherlands)
Chong-Wah Ngo (City University of Hong Kong, Hong Kong)
Wei Tsang Ooi (National University of Singapore, Singapore)

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

the National Engineering Laboratory for Big Data System Computing Technology
the Inlife-Handnet Open Fund
the Hong Kong Polytechnic University Grant G-UAEU
the Shenzhen high-level overseas talents program
the Natural Science Foundation of Guangdong Province
the communication platform at the Third Affiliated Hospital of Sun Yat-Sen University

Conference

MM '19

Sponsor:

SIGMM

MM '19: The 27th ACM International Conference on Multimedia

October 21 - 25, 2019

Nice, France

Acceptance Rates

MM '19 Paper Acceptance Rate 252 of 936 submissions, 27%;

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
