research-article

Crowdsourced automatic zoom and scroll for video retargeting

Authors:

Vincent Charvillat,

Romulus Grigoras,

Geraldine Morin

MM '10: Proceedings of the 18th ACM international conference on Multimedia

Pages 201 - 210

https://doi.org/10.1145/1873951.1873962

Published: 25 October 2010

Abstract

Screen size and display resolution limit the experience of watching videos on mobile devices. The viewing experience can be improved by determining important or interesting regions within the video (called regions of interest, or ROIs) and displaying only the ROIs to the viewer. Previous work focuses on analyzing the video content with visual attention models to infer the ROIs. Such content-based techniques, however, have limitations. In this paper, we propose an alternative paradigm for inferring ROIs from a video. We crowdsource ROI information from a large number of users through their implicit viewing behavior in a zoom and pan interface, and infer the ROIs from their collective wisdom. A retargeted video, consisting of relevant shots determined from historical user behavior, can then be generated automatically and replayed to subsequent users who prefer a less interactive viewing experience. This paper presents how we collect the user traces, infer the ROIs and their dynamics, group the ROIs into shots, and automatically reframe those shots to improve the aesthetics of the video. A user study with 48 participants shows that our automatically retargeted video is of comparable quality to one handcrafted by an expert user.
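To make the idea of inferring an ROI from collective viewing behavior concrete, the following is a minimal sketch and not the authors' algorithm, which the abstract does not detail. It assumes each user trace contributes one viewport rectangle per frame in the hypothetical format `(x, y, width, height)`. It votes on a coarse grid and returns the bounding box of the cells that a chosen share of users looked at. The function name `infer_roi` and the parameters `cell` and `min_share` are illustrative assumptions.

```python
from typing import List, Tuple

# Hypothetical trace format: one (x, y, width, height) viewport per user,
# expressed in source-video pixels for a single frame.
Viewport = Tuple[int, int, int, int]


def infer_roi(viewports: List[Viewport], frame_w: int, frame_h: int,
              cell: int = 16, min_share: float = 0.5) -> Viewport:
    """Bounding box of grid cells overlapped by at least min_share of viewports."""
    full_frame = (0, 0, frame_w, frame_h)
    if not viewports:
        return full_frame  # no traces: fall back to the whole frame

    cols = (frame_w + cell - 1) // cell
    rows = (frame_h + cell - 1) // cell
    votes = [[0] * cols for _ in range(rows)]

    # Each viewport votes once for every grid cell it overlaps.
    for x, y, w, h in viewports:
        c0, c1 = max(0, x // cell), min(cols - 1, (x + w - 1) // cell)
        r0, r1 = max(0, y // cell), min(rows - 1, (y + h - 1) // cell)
        for r in range(r0, r1 + 1):
            for c in range(c0, c1 + 1):
                votes[r][c] += 1

    # Keep only the cells that enough users were looking at.
    needed = min_share * len(viewports)
    hits = [(r, c) for r in range(rows) for c in range(cols)
            if votes[r][c] >= needed]
    if not hits:
        return full_frame

    r_min = min(r for r, _ in hits)
    r_max = max(r for r, _ in hits)
    c_min = min(c for _, c in hits)
    c_max = max(c for _, c in hits)
    x0, y0 = c_min * cell, r_min * cell
    x1 = min(frame_w, (c_max + 1) * cell)
    y1 = min(frame_h, (r_max + 1) * cell)
    return (x0, y0, x1 - x0, y1 - y0)


# Three users on a 128x128 frame; two of them zoomed into the same region.
traces = [(0, 0, 64, 64), (32, 32, 64, 64), (32, 32, 64, 64)]
roi = infer_roi(traces, 128, 128)
```

A bounding box over thresholded votes is only one simple choice. A pipeline like the one the abstract outlines would additionally need to track how the ROI moves over time and to group ROIs into shots, which this sketch does not attempt.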


Index Terms

Computing methodologies > Artificial intelligence > Computer vision

Information

Published In

MM '10: Proceedings of the 18th ACM international conference on Multimedia

October 2010

1836 pages

ISBN:9781605589336

DOI:10.1145/1873951

General Chairs: Alberto del Bimbo (University of Florence, Italy), Shih-Fu Chang (Columbia University, USA)

Program Chair: Arnold Smeulders (University of Amsterdam, NL)

Copyright © 2010 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Research-article

Conference

MM '10

Sponsor:

SIGMM

MM '10: ACM Multimedia Conference

October 25 - 29, 2010

Firenze, Italy

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Bibliometrics

Total Citations: 30

Total Downloads: 543

Downloads (Last 12 months): 14

Downloads (Last 6 weeks): 1

Reflects downloads up to 16 Feb 2025

Cited By

Wang X, Sun Y, Su Y (2022). A Study of Interaction Design for E-learning System Based on Distributed Asynchronous Communication. Proceedings of the Tenth International Symposium of Chinese CHI, pages 249-255. Online publication date: 22-Oct-2022.
https://dl.acm.org/doi/10.1145/3565698.3565793

Tang Z, Lv C, Tang Y (2022). Adaptive cropping with interframe relative displacement constraint for video retargeting. Signal Processing: Image Communication, 104 (116666). Online publication date: May-2022.
https://doi.org/10.1016/j.image.2022.116666

Chen Y, Monroy-Hernández A, Wehrman I, Oney S, Lasecki W, Vaish R (2020). Sifter: A Hybrid Workflow for Theme-based Video Curation at Scale. Proceedings of the 2020 ACM International Conference on Interactive Media Experiences, pages 65-73. Online publication date: 17-Jun-2020.
https://dl.acm.org/doi/10.1145/3391614.3393657

Mitrovic A, Gordon M, Piotrkowicz A, Dimitrova V (2019). Investigating the Effect of Adding Nudges to Increase Engagement in Active Video Watching. Artificial Intelligence in Education, pages 320-332. Online publication date: 21-Jun-2019.
https://doi.org/10.1007/978-3-030-23204-7_27

Kiess J, Kopf S, Guthier B, Effelsberg W (2018). A Survey on Content-Aware Image and Video Retargeting. ACM Transactions on Multimedia Computing, Communications, and Applications, 14:3 (1-28). Online publication date: 24-Jul-2018.
https://dl.acm.org/doi/10.1145/3231598

Luoto A, Heino P, You Y (2018). Lightweight Visualization and User Logging for Mobile 360-degree Videos. 2018 IEEE 11th Workshop on Software Engineering and Architectures for Real-time Interactive Systems (SEARIS), pages 1-8. Online publication date: Mar-2018.
https://doi.org/10.1109/SEARIS44442.2018.9180230

Kerbiche A, Jabra S, Zagrouba E, Charvillat V (2018). A robust video watermarking based on feature regions and crowdsourcing. Multimedia Tools and Applications, 77:20 (26769-26791). Online publication date: 1-Oct-2018.
https://dl.acm.org/doi/10.1007/s11042-018-5888-6

Crivelli T, Perez P, Oisel L (2016). Visual object trapping. Computer Vision and Image Understanding, 153:C (3-15). Online publication date: 1-Dec-2016.
https://dl.acm.org/doi/10.1016/j.cviu.2016.07.007

Mate S, Curcio I, Eronen A, Lehtiniemi A, Wainwright R, Corchado J, Bechini A, Hong J (2015). Automatic multi-camera remix from single video. Proceedings of the 30th Annual ACM Symposium on Applied Computing, pages 1270-1277. Online publication date: 13-Apr-2015.
https://dl.acm.org/doi/10.1145/2695664.2695881

Johnson-Roberson M, Bryson M, Douillard B, Pizarro O, Williams S (2015). Discovering salient regions on 3D photo-textured maps. Computer Vision and Image Understanding, 131:C (28-41). Online publication date: 1-Feb-2015.
https://dl.acm.org/doi/10.1016/j.cviu.2014.07.006
