MoViMash: online mobile video mashup

Published: 29 October 2012

Abstract

With the proliferation of mobile video cameras, it is becoming easier for users to capture videos of live performances and to share them socially with friends and the public. As an attendee of such a live performance typically has limited mobility, each video camera can capture only a restricted range of viewing angles and distances, producing a rather monotonous video clip. At such performances, however, multiple video clips can be captured by different users, likely from different angles and distances. These videos can be combined to produce a more interesting and representative mashup of the live performance for broadcasting and sharing. Earlier works select video shots merely based on the quality of the currently available videos. In a real video editing process, however, the recent selection history plays an important role in choosing future shots. In this work, we present MoViMash, a framework for automatic online video mashup that makes smooth shot transitions to cover the performance from diverse perspectives. Shot transition and shot length distributions are learned from professionally edited videos. Further, we introduce view quality assessment into the framework to filter out shaky, occluded, and tilted videos. To the best of our knowledge, this is the first attempt to incorporate history-based diversity measurement, state-based video editing rules, and view quality into automated video mashup generation. Experimental results demonstrate the effectiveness of the MoViMash framework.
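
The abstract outlines the selection logic only at a high level. Purely as an illustration, and not as the authors' implementation, the Python sketch below shows one way such a history-aware shot selector could be organized: low-quality views (shaky, occluded, tilted) are filtered out first, the next camera is chosen by trading off view quality against diversity with respect to the recent selection history, and the shot length is drawn from a distribution assumed to be learned offline from professionally edited videos. All names and parameters (next_shot, sample_shot_length, the weights, the quality floor) are hypothetical.

import random

# Hypothetical sketch (not the paper's code): greedy, history-aware shot selection.

def sample_shot_length(length_distribution):
    """Draw a shot length in seconds from a learned categorical distribution."""
    lengths, probs = zip(*length_distribution.items())
    return random.choices(lengths, weights=probs, k=1)[0]

def next_shot(views, history, w_quality=0.6, w_diversity=0.4, quality_floor=0.5):
    """Pick the camera id for the next shot.

    views   : dict camera_id -> quality score in [0, 1], already penalized for
              shake, occlusion, and tilt by a view-quality assessor.
    history : list of recently used camera ids, most recent last.
    """
    # View quality assessment: drop views below the quality floor
    # (fall back to all views if every camera is currently poor).
    usable = {c: q for c, q in views.items() if q >= quality_floor} or dict(views)

    def diversity(cam):
        # History-based diversity: cameras not used recently score higher.
        if cam not in history:
            return 1.0
        last_use = max(i for i, c in enumerate(history) if c == cam)
        return (len(history) - 1 - last_use) / max(len(history), 1)

    # Simple editing rule: avoid cutting straight back to the current camera.
    current = history[-1] if history else None
    candidates = {c: q for c, q in usable.items() if c != current} or usable

    # Combine quality and history-based diversity and pick the best camera.
    return max(candidates,
               key=lambda c: w_quality * candidates[c] + w_diversity * diversity(c))

if __name__ == "__main__":
    views = {"cam_A": 0.9, "cam_B": 0.7, "cam_C": 0.3}   # cam_C is too shaky
    history = ["cam_A", "cam_B", "cam_A"]                 # recent selections
    shot_lengths = {2: 0.2, 4: 0.5, 6: 0.3}               # learned length distribution
    print(next_shot(views, history), sample_shot_length(shot_lengths))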

Information

Published In

MM '12: Proceedings of the 20th ACM international conference on Multimedia
October 2012
1584 pages
ISBN:9781450310895
DOI:10.1145/2393347
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. mobile video
  2. video mashup
  3. virtual director

Qualifiers

  • Research-article

Conference

MM '12: ACM Multimedia Conference
October 29 - November 2, 2012
Nara, Japan

Acceptance Rates

Overall acceptance rate: 2,145 of 8,556 submissions, 25%

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 24
  • Downloads (last 6 weeks): 0
Reflects downloads up to 27 Feb 2025

Citations

Cited By

  • (2024) Composition and Transmission of Videos Generated by Multiple Users. From Multimedia Communications to the Future Internet, pp. 202-218. DOI: 10.1007/978-3-031-71874-8_14. Online publication date: 13-Sep-2024.
  • (2023) A Reinforcement Learning-Based Automatic Video Editing Method Using Pre-trained Vision-Language Model. Proceedings of the 31st ACM International Conference on Multimedia, pp. 6441-6450. DOI: 10.1145/3581783.3611878. Online publication date: 26-Oct-2023.
  • (2023) COSense: collaborative and opportunistic sensing of road events by vehicles' cameras. CCF Transactions on Pervasive Computing and Interaction, 5(3), pp. 276-287. DOI: 10.1007/s42486-023-00126-9. Online publication date: 15-Feb-2023.
  • (2022) PopStage. ACM Transactions on Graphics, 41(6), pp. 1-13. DOI: 10.1145/3550454.3555467. Online publication date: 30-Nov-2022.
  • (2022) Automated video editing based on learned styles using LSTM-GAN. Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing, pp. 73-80. DOI: 10.1145/3477314.3507141. Online publication date: 25-Apr-2022.
  • (2021) Learning to Visualize Music Through Shot Sequence for Automatic Concert Video Mashup. IEEE Transactions on Multimedia, 23, pp. 1731-1743. DOI: 10.1109/TMM.2020.3003631. Online publication date: 2021.
  • (2021) Reinforcement Learning Based Automatic Personal Mashup Generation. 2021 IEEE International Conference on Multimedia and Expo (ICME), pp. 1-6. DOI: 10.1109/ICME51207.2021.9428357. Online publication date: 5-Jul-2021.
  • (2020) Learning From Music to Visual Storytelling of Shots: A Deep Interactive Learning Mechanism. Proceedings of the 28th ACM International Conference on Multimedia, pp. 102-110. DOI: 10.1145/3394171.3413985. Online publication date: 12-Oct-2020.
  • (2020) Prosuming Live Multimedia Content at the Edge. Proceedings of the 2020 ACM International Conference on Interactive Media Experiences, pp. 160-164. DOI: 10.1145/3391614.3399393. Online publication date: 17-Jun-2020.
  • (2020) PopMash: an automatic musical-mashup system using computation of musical and lyrical agreement for transitions. Multimedia Tools and Applications. DOI: 10.1007/s11042-020-08934-2. Online publication date: 13-May-2020.
