Abstract
In this article, we propose a novel multi-camera system that extracts events from soccer matches. A precise ontological definition of soccer events is still an open question. In our definition, the events are the free kick, the corner kick, the penalty kick and the goal, because these are the moments that audiences most want to watch. Such events are important for highlight selection and sports data analysis. At present, the events, together with the ball and player information, are selected and labeled manually from the images, which is a heavy workload for staff. To address this problem, our system extracts the events automatically. For soccer videos, the system first applies a local-based deep neural network to detect the ball and the players in the input images. The ball and player bounding boxes are then processed separately. Each player is labeled as one of three classes, either of the two teams or the referee, and a novel unsupervised U-encoder is designed for this labeling. For the ball, the use of multiple cameras allows us to refine the detection results: we compute the world coordinates of the ball from the camera parameters and then rebuild the ball trajectory and the court in a top view. On the reconstructed map, we obtain the soccer events by motion analysis of the ball trajectory, and then use the ball location and the player classification results to display the events for each camera. Test results on real videos from a European soccer league show good detection and labeling performance, and the system finds all the events in the test videos. The proposed system can handle complex cases such as occlusion and pose variation, which occur frequently in real applications.
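To make the step from per-camera ball detections to a world coordinate more concrete, the following is a minimal sketch of linear (DLT) triangulation from two calibrated views. It is an illustration under stated assumptions, not the method of the article: the abstract does not say which reconstruction technique is used, and the camera matrices, the function names `triangulate` and `project`, and the toy coordinate frame (with y as the vertical axis) are all hypothetical.

```python
import numpy as np

def triangulate(projections, pixels):
    """Linear (DLT) triangulation of one 3D point from two or more views.

    projections: list of 3x4 camera projection matrices (assumed known).
    pixels: list of (u, v) ball centers, one per camera.
    """
    rows = []
    for P, (u, v) in zip(projections, pixels):
        # Each view contributes two linear constraints on the homogeneous point.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]

def project(P, X):
    """Project a 3D point into a camera (used here only to fake detections)."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Hypothetical intrinsics and two hypothetical cameras 5 m apart along x.
K = np.array([[1000.0, 0.0, 960.0],
              [0.0, 1000.0, 540.0],
              [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-5.0], [0.0], [0.0]])])

# Synthetic ball position, used only to generate noise-free detections.
ball = np.array([2.0, 1.0, 30.0])
detections = [project(P1, ball), project(P2, ball)]

est = triangulate([P1, P2], detections)

# Top-view position: drop the vertical axis (y in this toy frame).
top_view = est[[0, 2]]
```

With real detections the pixel coordinates are noisy, so the recovered point is only a least-squares estimate; adding more cameras adds rows to the system and generally makes the estimate more stable.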
Notes
This work is supported by the National Key Research and Development Program of China (No. 2018YFC0116800).
The dataset is provided by Union of European Football Associations.
About this article
Cite this article
Zhang, K., Wu, J., Tong, X. et al. An automatic multi-camera-based event extraction system for real soccer videos. Pattern Anal Applic 23, 953–965 (2020). https://doi.org/10.1007/s10044-019-00830-2
DOI: https://doi.org/10.1007/s10044-019-00830-2