Gaze-Driven Video Re-Editing

Authors:
Eakta Jain (Carnegie Mellon University, Disney Research Pittsburgh, and University of Florida, Pittsburgh, PA)
Yaser Sheikh (Carnegie Mellon University, Pittsburgh, PA)
Ariel Shamir (Disney Research Pittsburgh and Interdisciplinary Center Israel, Pittsburgh, PA)
Jessica Hodgins (Carnegie Mellon University and Disney Research Pittsburgh, Pittsburgh, PA)

ACM Transactions on Graphics, Volume 34, Issue 2, Article No. 21, pp. 1–12. https://doi.org/10.1145/2699644

Published: 02 March 2015

Abstract

Given the current profusion of devices for viewing media, video content created at one aspect ratio is often viewed on displays with different aspect ratios. Many previous solutions address this problem by retargeting or resizing the video, but a more general solution would re-edit the video for the new display. Our method employs the three primary editing operations: pan, cut, and zoom. We let viewers implicitly reveal what is important in a video by tracking their gaze as they watch the video. We present an algorithm that optimizes the path of a cropping window based on the collected eyetracking data, finds places to cut, and computes the size of the cropping window. We present results on a variety of video clips, including close-up and distant shots, and stationary and moving cameras. We conduct two experiments to evaluate our results. First, we eyetrack viewers on the result videos generated by our algorithm, and second, we perform a subjective assessment of viewer preference. These experiments show that viewer gaze patterns are similar on our result videos and on the original video clips, and that viewers prefer our results to an optimized crop-and-warp algorithm.
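
The pan and cut operations described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes one horizontal gaze sample per viewer per frame, replaces the paper's optimization with a per-frame median followed by a moving average, and treats any large jump of the gaze center as a cut. Zoom is omitted, so the crop window has a fixed width. All function names, parameters, and thresholds here are hypothetical.

```python
from statistics import median


def gaze_centers(gaze_x_per_frame):
    """Median horizontal gaze position per frame (robust to outlier viewers)."""
    return [median(xs) for xs in gaze_x_per_frame]


def smooth_path(centers, radius=2):
    """Moving-average smoothing so the window pans instead of jittering."""
    n = len(centers)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(centers[lo:hi]) / (hi - lo))
    return out


def find_cuts(centers, jump_threshold):
    """Frame indices where the gaze center jumps too far to follow with a pan."""
    return [i for i in range(1, len(centers))
            if abs(centers[i] - centers[i - 1]) > jump_threshold]


def crop_path(gaze_x_per_frame, frame_width, crop_width, jump_threshold):
    """Return (left edge of the crop window per frame, cut frame indices)."""
    centers = gaze_centers(gaze_x_per_frame)
    cuts = find_cuts(centers, jump_threshold)
    # Smooth each segment between cuts separately so a cut stays abrupt.
    bounds = [0] + cuts + [len(centers)]
    path = []
    for start, end in zip(bounds, bounds[1:]):
        path.extend(smooth_path(centers[start:end]))
    # Clamp so the window never leaves the original frame.
    max_left = frame_width - crop_width
    lefts = [min(max(c - crop_width / 2, 0), max_left) for c in path]
    return lefts, cuts


# Hypothetical data: three viewers, six frames, attention shifts at frame 3.
gaze = [[98, 100, 105], [100, 102, 110], [101, 104, 99],
        [500, 498, 510], [502, 505, 495], [499, 503, 520]]
lefts, cuts = crop_path(gaze, frame_width=640, crop_width=200,
                        jump_threshold=150)
```

Smoothing each segment independently keeps the pan gentle within a shot while letting the window jump at a cut, which is the trade-off between the two operations that the abstract points to.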

Supplemental Material

a21.mp4 (mp4, 17.2 MB)
jain.zip (zip, 110.7 MB): Supplemental movie, appendix, image and software files for Gaze-Driven Video Re-Editing

Index Terms

Gaze-Driven Video Re-Editing
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
      2. Computer vision tasks
        Scene understanding
  2. Computer graphics
    1. Image manipulation
    2. Rendering

Published in
ACM Transactions on Graphics Volume 34, Issue 2
February 2015
136 pages
ISSN:0730-0301
EISSN:1557-7368
DOI:10.1145/2742222
Editor:
Holly Rushmeier
Yale University
Copyright © 2015 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 2 March 2015
- Revised: 1 September 2014
- Accepted: 1 September 2014
- Received: 1 December 2012
Author Tags
Perceptually-based algorithms
curve fitting
eyetracking
video editing
video retargeting
Qualifiers
- research-article
- Research
- Refereed
Article Metrics
- Total Citations: 36
- Total Downloads: 934
- Downloads (Last 12 months): 34
- Downloads (Last 6 weeks): 8