DOI:10.1145/3343031.3350938
research-article

MvsGCN: A Novel Graph Convolutional Network for Multi-video Summarization

Published: 15 October 2019

Abstract

Multi-video summarization, which aims to generate a single summary for a collection of videos, is an important task in dealing with ever-growing video data. In this paper, we are the first to propose a graph convolutional network for multi-video summarization. The network measures the importance and relevance of each video shot both within its own video and across the whole video collection. An important node sampling method is proposed to emphasize the effective features that are more likely to be selected for the final video summary. Two strategies are integrated into the network to address the inherent class imbalance problem in video summarization. A diversity regularization term in the loss encourages the generation of a diverse summary. Extensive experiments show that, in comparison with traditional and recent graph models and state-of-the-art video summarization methods, the proposed model is effective in generating a representative summary for multiple videos with good diversity. It also achieves state-of-the-art performance on two standard video summarization datasets.
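The abstract names the ingredients but not their exact form, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It treats each video shot as a graph node, builds edges from cosine similarity between shot features, applies the standard graph convolution propagation rule (symmetrically normalized adjacency with self-loops), and combines a positively weighted binary cross-entropy (one common remedy for class imbalance, not necessarily either of the paper's two strategies) with a pairwise-similarity penalty standing in for the diversity regularization. All function names, the random features, and the 0.1 weight are hypothetical; the important node sampling step is not sketched.

```python
import numpy as np


def unit_rows(x):
    """L2-normalize each row (assumes no all-zero feature rows)."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def normalized_adjacency(features):
    """Shot graph from cosine similarity, normalized as D^-1/2 A D^-1/2."""
    unit = unit_rows(features)
    sim = np.clip(unit @ unit.T, 0.0, None)  # keep non-negative edge weights
    np.fill_diagonal(sim, 1.0)               # self-loops
    inv_sqrt_deg = 1.0 / np.sqrt(sim.sum(axis=1))
    return sim * inv_sqrt_deg[:, None] * inv_sqrt_deg[None, :]


def gcn_layer(a_hat, h, w):
    """One graph convolution step: ReLU(A_hat @ H @ W)."""
    return np.maximum(a_hat @ h @ w, 0.0)


def weighted_bce(scores, labels, pos_weight):
    """Binary cross-entropy with the rare positive class up-weighted (assumption)."""
    p = np.clip(scores, 1e-7, 1.0 - 1e-7)
    per_shot = -(pos_weight * labels * np.log(p)
                 + (1.0 - labels) * np.log(1.0 - p))
    return float(per_shot.mean())


def diversity_penalty(features, scores):
    """Score-weighted mean cosine similarity over distinct shot pairs (assumption)."""
    unit = unit_rows(features)
    sim = unit @ unit.T
    pair_w = np.outer(scores, scores)
    np.fill_diagonal(sim, 0.0)     # ignore self-pairs
    np.fill_diagonal(pair_w, 0.0)
    return float((pair_w * sim).sum() / (pair_w.sum() + 1e-7))


# Toy data: 6 shots with 8-dimensional features, one labeled as a summary shot.
rng = np.random.default_rng(0)
shots = rng.normal(size=(6, 8))
labels = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])

a_hat = normalized_adjacency(shots)
hidden = gcn_layer(a_hat, shots, rng.normal(size=(8, 4)))
logits = (a_hat @ hidden @ rng.normal(size=(4, 1))).ravel()
scores = 1.0 / (1.0 + np.exp(-logits))  # per-shot importance in [0, 1]

pos_weight = (labels == 0).sum() / (labels == 1).sum()  # negatives per positive
loss = weighted_bce(scores, labels, pos_weight) + 0.1 * diversity_penalty(shots, scores)
print(f"loss = {loss:.4f}")
```

In this sketch the penalty grows when highly scored shots resemble one another, so minimizing it pushes the selected shots apart in feature space; the weights here are random, so the printed loss only shows that the pieces fit together.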

Information & Contributors

Information

Published In

MM '19: Proceedings of the 27th ACM International Conference on Multimedia
October 2019
2794 pages
ISBN:9781450368896
DOI:10.1145/3343031
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. class imbalance problem
  2. graph convolutional network
  3. multi-video summarization

Qualifiers

  • Research-article

Funding Sources

  • the National Engineering Laboratory for Big Data System Computing Technology
  • the Inlife-Handnet Open Fund
  • the Hong Kong Polytechnic University Grant G-UAEU
  • the Shenzhen high-level overseas talents program
  • the Natural Science Foundation of Guangdong Province
  • the communication platform at the Third Affiliated Hospital of Sun Yat-sen University

Conference

MM '19

Acceptance Rates

MM '19 Paper Acceptance Rate 252 of 936 submissions, 27%;
Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 17
  • Downloads (Last 6 weeks): 3
Reflects downloads up to 17 Feb 2025

Citations

Cited By

  • (2025) Graph convolutional network for fast video summarization in compressed domain. Neurocomputing 617 (128945). DOI:10.1016/j.neucom.2024.128945. Online publication date: Feb-2025
  • (2024) Video Summarization Generation Network Based on Dynamic Graph Contrastive Learning and Feature Fusion. Electronics 13:11 (2039). DOI:10.3390/electronics13112039. Online publication date: 23-May-2024
  • (2024) Achieving Procedure-Aware Instructional Video Correlation Learning Under Weak Supervision from a Collaborative Perspective. International Journal of Computer Vision. DOI:10.1007/s11263-024-02272-8. Online publication date: 4-Nov-2024
  • (2023) Video Summarization Generation Based on Graph Structure Reconstruction. Electronics 12:23 (4757). DOI:10.3390/electronics12234757. Online publication date: 23-Nov-2023
  • (2023) Modality-Oriented Graph Learning Toward Outfit Compatibility Modeling. IEEE Transactions on Multimedia 25 (856-867). DOI:10.1109/TMM.2021.3134164. Online publication date: 2023
  • (2023) Feature fusion over hyperbolic graph convolution networks for video summarisation. IET Computer Vision 18:1 (150-164). DOI:10.1049/cvi2.12232. Online publication date: 25-Aug-2023
  • (2023) Multi-scale deep feature fusion based sparse dictionary selection for video summarization. Signal Processing: Image Communication 118 (117006). DOI:10.1016/j.image.2023.117006. Online publication date: Oct-2023
  • (2023) Multi video summarization using query based deep optimization algorithm. International Journal of Machine Learning and Cybernetics 14:10 (3591-3606). DOI:10.1007/s13042-023-01852-3. Online publication date: 18-May-2023
  • (2023) A comprehensive study of automatic video summarization techniques. Artificial Intelligence Review 56:10 (11473-11633). DOI:10.1007/s10462-023-10429-z. Online publication date: 13-Mar-2023
  • (2022) Multi-modal Graph Contrastive Learning for Micro-video Recommendation. Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (1807-1811). DOI:10.1145/3477495.3532027. Online publication date: 6-Jul-2022
