Modeling the timing of cuts in automatic editing of concert videos

Roininen, Mikko J.; Leppänen, Jussi; Eronen, Antti J.; Curcio, Igor D. D.; Gabbouj, Moncef

doi:10.1007/s11042-016-3304-7

Published: 20 February 2016

Volume 76, pages 6683–6707, (2017)

Multimedia Tools and Applications

Mikko J. Roininen¹,
Jussi Leppänen²,
Antti J. Eronen²,
Igor D. D. Curcio² &
…
Moncef Gabbouj¹

Abstract

An increasing amount of video content is being recorded by people at public events. However, editing such videos can be challenging for the average user. We describe an approach for modeling the shot cut timing of professionally edited concert videos. We analyze the temporal positions of cuts in relation to the music meter grid and form Markov chain models from the switching patterns found and their occurrence frequencies. The stochastic Markov chain models are combined with audio change point analysis and cut deviation models to automatically generate temporal editing cues for unedited concert video recordings. Videos edited according to the models are compared in a user study against a baseline automatic editing method as well as against videos edited by hand. The study results show that users prefer the cut timing from the proposed system over the baseline by a clear margin, whereas a much smaller difference is observed in the preference for hand-edited videos over the proposed method.
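
As a rough illustration of the Markov chain idea only, the following minimal sketch fits a first-order chain to shot lengths measured in beats and samples cut positions from it. It is not the authors' implementation: the state definition (shot length in beats), the function names `fit_markov_chain` and `sample_cut_beats`, and the training data are all assumptions made for this example, and the audio change point analysis and cut deviation models mentioned in the abstract are not modeled here.

```python
import random
from collections import Counter, defaultdict


def fit_markov_chain(sequences):
    """Estimate first-order transition probabilities from state sequences.

    Each state is assumed (for this sketch) to be a shot length in beats.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return {
        prev: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for prev, c in counts.items()
    }


def sample_cut_beats(model, start_state, total_beats, rng):
    """Walk the chain and return the beat indices at which to place cuts."""
    cuts, position, state = [], 0, start_state
    while True:
        transitions = model.get(state)
        if not transitions:
            break  # no successor was observed for this state
        states = list(transitions)
        weights = [transitions[s] for s in states]
        state = rng.choices(states, weights=weights, k=1)[0]
        position += state  # the sampled state is the next shot length
        if position >= total_beats:
            break
        cuts.append(position)
    return cuts


# Hypothetical shot-length sequences (in beats), invented for this example.
training = [[4, 4, 8, 4, 2, 2, 4], [8, 4, 4, 8, 4]]
model = fit_markov_chain(training)
cuts = sample_cut_beats(model, start_state=4, total_beats=64,
                        rng=random.Random(0))
print(cuts)
```

Because every cut lands on a whole number of beats, the generated cues stay aligned with the beat grid by construction; mapping beat indices to timestamps would require a separate beat tracker, which is outside this sketch.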

Notes

The same term has also been used for combining parts of multiple audio and video clips from different, possibly unrelated sources [16, 19]. We refer to such works as video remixes [17].
In this work we use the term cut to collectively refer to both abrupt switches and longer transitions between shots.

References

Cai R, Zhang L, Jing F, Lai W, Ma W (2007) Automated music video generation using WEB image resource. In: Proceedings of the IEEE international conference on acoustics, speech, and signal processing, ICASSP 2007. doi:10.1109/ICASSP.2007.366341, Honolulu, pp 737–740
Chu WT, Chen JC, Wu JL (2007) Tiling slideshow: an audiovisual presentation method for consumer photos. IEEE Multimedia 14(3):36–45. doi:10.1109/MMUL.2007.66
Dwelle T (1994) Music video 101: home camcorder production. DASH Entertainment Productions Incorporated. http://books.google.fi/books?id=tQKsHAAACAAJ
Ellis DP (2007) Beat tracking by dynamic programming. J New Music Res 36(1):51–60
Eronen A, Klapuri A (2010) Music tempo estimation with k-NN regression. IEEE Audio, Speech, Language Process 18(1):50–57
Foote J, Cooper M, Girgensohn A (2002) Creating music videos using automatic media analysis. In: Proceedings of the tenth ACM international conference on multimedia, MULTIMEDIA ’02. doi:10.1145/641007.641119. ACM, New York, pp 553–560
Gillet O, Essid S, Richard G (2007) On the correlation of automatic audio and visual segmentations of music videos. IEEE Trans Circuits Syst Video Technol 17(3):347–355. doi:10.1109/TCSVT.2007.890831
Hua XS, Lu L, Zhang HJ (2004) Automatic music video generation based on temporal pattern analysis. In: Proceedings of the 12th annual ACM international conference on multimedia, MULTIMEDIA ’04. doi:10.1145/1027527.1027641. ACM, New York, pp 472–475
Hua XS, Lu L, Zhang HJ (2004) Optimization-based automated home video editing system. IEEE Trans Circuits Syst Video Technol 14(5):572–583. doi:10.1109/TCSVT.2004.826750
Kennedy L, Naaman M (2009) Less talk, more rock: Automated organization of community-contributed collections of concert videos. In: Proceedings of the 18th international conference on world wide web, WWW ’09. doi:10.1145/1526709.1526752. ACM, New York, pp 311–320
Klapuri A, Eronen A, Astola J (2006) Analysis of the meter of acoustic musical signals. IEEE Audio, Speech, Language Process 14(1):342–355. doi:10.1109/TSA.2005.854090
Lai PS, Cheng SS, Sun SY, Huang TY, Su JM, Xu YY, Chen YH, Chuang SC, Tseng CL, Hsieh CL, Lu YL, Shen YC, Chen JR, Nie JB, Tsai FP, Huang HC, Pao HT, Fu HC (2005) Automated information mining on multimedia TV news archives. In: Proceedings of the 9th international conference on knowledge-based intelligent information and engineering systems - Volume Part II, KES’05. Springer-Verlag, Berlin, Heidelberg, pp 1238–1244
Liao C, Wang PP, Zhang Y (2008) Mining association patterns between music and video clips in professional MTV. In: Proceedings of the 15th international multimedia modeling conference on advances in multimedia modeling, MMM ’09. doi:10.1007/978-3-540-92892-8_41. Springer-Verlag, Berlin, Heidelberg, pp 401–412
Matsuo Y, Amano M, Uehara K (2002) Mining video editing rules in video streams. In: Proceedings of the 10th ACM international conference on multimedia, MULTIMEDIA ’02. doi:10.1145/641007.641058. ACM, New York, pp 255–258
Naci U (2010) Multimedia content analysis, indexing and summarization: a perspective on real-life uses cases. PhD thesis, Technische Universiteit Delft. http://repository.tudelft.nl/view/ir/uuid:93784cba-42d9-4d94-a430-0aacfa01bf28/
Nakano T, Murofushi S, Goto M, Morishima S (2011) Dancereproducer: An automatic mashup music video generation system by reusing dance video clips on the web. In: Proceedings of the 8th sound and music computing conference (SMC 2011), pp 183–189
Nitta N, Babaguchi N (2011) Example-based video remixing. Multimedia Tools Appl 51(2):649–673. doi:10.1007/s11042-010-0633-9
Norris JR (1998) Markov chains. Cambridge series in statistical and probabilistic mathematics. Cambridge University Press
Ohya H, Morishima S (2013) Automatic mash up music video generation system by remixing existing video content. In: International conference on culture and computing (Culture Computing), 2013. doi:10.1109/CultureComputing.2013.44, pp 157–158
Ojala J, Mate S, Curcio IDD, Lehtiniemi A, Väänänen-Vainio-Mattila K (2014) Automated creation of mobile video remixes: user trial in three event contexts. In: Proceedings of the 13th international conference on mobile and ubiquitous multimedia, MUM ’14. doi:10.1145/2677972.2677975. ACM, New York, pp 170–179
Peeters G, Papadopoulos H (2011) Simultaneous beat and downbeat-tracking using a probabilistic framework: theory and large-scale evaluation. IEEE Audio, Speech, Language Process 19(6):1754–1769. doi:10.1109/TASL.2010.2098869
Saini M, Venkatagiri SP, Ooi WT, Chan MC (2013) The Jiku mobile video dataset. In: Proceedings of the 4th ACM multimedia systems conference, MMSys ’13. doi:10.1145/2483977.2483990. ACM, New York, pp 108–113
Saini MK, Gadde R, Yan S, Ooi WT (2012) MoViMash: online mobile video mashup. In: Proceedings of the 20th ACM international conference on multimedia, MM ’12. doi:10.1145/2393347.2393373. ACM, New York, pp 139–148
Shamma DA, Pardo B, Hammond KJ (2005) MusicStory: a personalized music video creator. In: Proceedings of the 13th annual ACM international conference on multimedia, MULTIMEDIA ’05. doi:10.1145/1101149.1101278. ACM, New York, pp 563–566
Shao X, Xu C, Maddage NC, Tian Q, Kankanhalli MS, Jin JS (2006) Automatic summarization of music videos. ACM Trans Multimedia Comput Commun Appl 2(2):127–148. doi:10.1145/1142020.1142023
Shrestha P, de With PH, Weda H, Barbieri M, Aarts EH (2010) Automatic mashup generation from multiple-camera concert recordings. In: Proceedings of the international conference on multimedia, MM ’10. doi:10.1145/1873951.1874023. ACM, New York, pp 541–550
Tomás B, Pereira F (2011) Musical slideshow: boosting user experience in photo presentation. Multimedia Tools Appl 55(3):627–653. doi:10.1007/s11042-010-0582-3
Vihavainen S, Mate S, Seppälä L, Cricri F, Curcio IDD (2011) We want more: human-computer collaboration in mobile social video remixing of music concerts. ACM, New York
Wang J, Chng E, Xu C, Lu H, Tian Q (2007) Generation of personalized music sports video using multimodal cues. IEEE Trans Multimedia 9(3):576–588. doi:10.1109/TMM.2006.888013
Wang JC, Yang YH, Jhuo IH, Lin YY, Wang HM (2012) The acousticvisual emotion Guassians model for automatic generation of music video. In: Proceedings of the 20th ACM international conference on multimedia, MM ’12. doi:10.1145/2393347.2396494. ACM, New York, pp 1379–1380
Wu X, Xu B, Qiao Y, Tang X (2012) Automatic music video generation: cross matching of music and image. In: Proceedings of the 20th ACM international conference on multimedia, MM ’12. doi:10.1145/2393347.2396495. ACM, New York, pp 1381–1382
Xu S, Jin T, Lau F (2008) Automatic generation of music slide show using personal photos. In: 10th IEEE international symposium on multimedia, 2008. ISM 2008. doi:10.1109/ISM.2008.39, pp 214–219
Yoon JC, Lee IK, Byun S (2009) Automated music video generation using multi-level feature-based segmentation. Multimedia Tools Appl 41(2):197–214. doi:10.1007/s11042-008-0225-0
Young S, Young S (1994) The HTK hidden Markov model toolkit: design and philosophy. Entropic Cambridge Research Laboratory Ltd 2:2–44

Author information

Authors and Affiliations

Department of Signal Processing, Tampere University of Technology, Korkeakoulunkatu 10, 33720, Tampere, Finland
Mikko J. Roininen & Moncef Gabbouj
Nokia Technologies, Visiokatu 3, 33720, Tampere, Finland
Jussi Leppänen, Antti J. Eronen & Igor D. D. Curcio

Corresponding author

Correspondence to Mikko J. Roininen.

About this article

Cite this article

Roininen, M.J., Leppänen, J., Eronen, A.J. et al. Modeling the timing of cuts in automatic editing of concert videos. Multimed Tools Appl 76, 6683–6707 (2017). https://doi.org/10.1007/s11042-016-3304-7

Received: 07 July 2015
Revised: 08 December 2015
Accepted: 26 January 2016
Published: 20 February 2016
Issue Date: March 2017
DOI: https://doi.org/10.1007/s11042-016-3304-7
