ABSTRACT
The capacity to capture and broadcast sports events with near real-time volumetric reconstruction techniques opens exciting perspectives on how audiences can consume and interact with this content. In this work, we propose the design of a real-time cinematography system capable of generating high-quality framing and editing of volumetric-captured content by mimicking real broadcast footage. To illustrate our approach, we focus on the specific problem of cinematography for ring-based events such as the Ultimate Fighting Championship (UFC). We start by extracting statistical features from hours of real footage to understand the specific framing and cutting behaviors of real broadcast directors. We then exploit these features in a real-time editing system, not only to replicate the behaviors observed in existing broadcasts, but also to generalize to novel camera layouts. We demonstrate our approach on the volumetric reconstruction of UFC fights, compare our results with baseline methods, and report both qualitative and quantitative evaluations.
Index Terms
- Real-time Computational Cinematographic Editing for Broadcasting of Volumetric-captured events: an Application to Ultimate Fighting