Abstract
Automatic sports video production aims to produce an aesthetically pleasing broadcast of sporting events, which enables low-cost solutions for TV-like streaming. We present a new video system that automatically produces a smooth and pleasant broadcast of football games using a camera rig composed of three fixed 4K cameras. The system automatically detects and localizes the main action of a football game and frames it, yielding a production that resembles the work of a professional camera operator. We compute the actionness of a football game using a ball detector and an occupancy-map representation of players based on a saliency map. The action framing is computed through geometric pitch modeling. The whole video processing pipeline runs on the edge using smartphones, and we report a throughput of 25 FPS.
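To make the idea of localizing and framing the action concrete, the following is a minimal sketch and not the system described here. It assumes the action center can be approximated by blending a detected ball position with the centroid of a player occupancy map, and it replaces the geometric pitch model with a plain image-space crop. The function names and the `ball_weight` parameter are hypothetical.

```python
import numpy as np


def occupancy_centroid(occupancy):
    """Return the (x, y) centroid of a non-negative occupancy map, or None if it is empty."""
    total = occupancy.sum()
    if total <= 0:
        return None
    ys, xs = np.indices(occupancy.shape)
    return (float((xs * occupancy).sum() / total),
            float((ys * occupancy).sum() / total))


def action_center(occupancy, ball_xy=None, ball_weight=0.7):
    """Blend the ball position with the player centroid.

    ball_weight is a hypothetical parameter, not a value taken from the paper.
    Falls back to whichever cue is available when the other one is missing.
    """
    centroid = occupancy_centroid(occupancy)
    if ball_xy is None:
        return centroid
    if centroid is None:
        return ball_xy
    bx, by = ball_xy
    cx, cy = centroid
    return (ball_weight * bx + (1 - ball_weight) * cx,
            ball_weight * by + (1 - ball_weight) * cy)


def crop_window(center, frame_size, crop_size):
    """Return (left, top, width, height) of a crop centered on `center`,
    shifted as needed so that it stays inside the frame."""
    (cx, cy), (fw, fh), (cw, ch) = center, frame_size, crop_size
    left = min(max(cx - cw / 2, 0), fw - cw)
    top = min(max(cy - ch / 2, 0), fh - ch)
    return left, top, cw, ch
```

In a real pipeline the center would also be smoothed over time before cropping so that the virtual camera does not jitter; that step is omitted from this sketch.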
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Carrillo, H., Quiroga, J., Zapata, L. et al. Automatic football video production system with edge processing. Machine Vision and Applications 33, 32 (2022). https://doi.org/10.1007/s00138-022-01283-0
DOI: https://doi.org/10.1007/s00138-022-01283-0