First Person View Video Summarization Subject to the User Needs

Published: 01 October 2016

Abstract

Our lives are becoming heavily documented and expressed on the digital substrate. This booming flow of consumer video has led to an increasing demand for multimedia analysis tools to organize and summarize those visual memories. Due to the personal nature of such videos, though, the summarization needs to be adapted to the user's needs and preferences. Yet most summarization systems rely solely on pre-defined criteria, e.g. story coherence or pre-trained interestingness classifiers. I propose a system that finds the digital memories relevant to a given semantic query and then summarizes them in a customized manner. The proposed framework includes a wide set of tools to match a user's needs, from retrieval using multimodal queries to summarization tailored to his/her preferences, provided both passively and actively. Preliminary results show the high potential of such a framework, with over 70% retrieval accuracy. More importantly, as seen from the user study, the generated summaries achieve an unprecedented compromise between usability and quality.

Cited By

  • (2024) Personalized Video Summarization: A Comprehensive Survey of Methods and Datasets. Applied Sciences, 14:11 (4400). DOI: 10.3390/app14114400. Online publication date: 22-May-2024
  • (2018) Personalized Serious Games for Cognitive Intervention with Lifelog Visual Analytics. Proceedings of the 26th ACM international conference on Multimedia (328-336). DOI: 10.1145/3240508.3240598. Online publication date: 15-Oct-2018

Published In

MM '16: Proceedings of the 24th ACM international conference on Multimedia
October 2016
1542 pages
ISBN:9781450336031
DOI:10.1145/2964284
© 2016 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. episodic memory retrieval
  2. first person view
  3. user's preferences
  4. video summarization
  5. wearable and consumer videos

Qualifiers

  • Research-article

Funding Sources

  • Obra Social
  • Singapore International Graduate Award (SINGA)

Conference

MM '16
MM '16: ACM Multimedia Conference
October 15 - 19, 2016
Amsterdam, The Netherlands

Acceptance Rates

MM '16 Paper Acceptance Rate 52 of 237 submissions, 22%;
Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
