Multimodal feature extraction and fusion for semantic mining of soccer video: a survey

Artificial Intelligence Review

Abstract

This paper presents a classified review of work on soccer video analysis. It surveys existing approaches to highlight event detection, video summarization and retrieval based on the video stream, ball and player tracking for match statistics, technical and tactical analysis, and the use of different information sources in soccer video analysis. Several major commercial software packages for video analysis are also introduced and compared. Because automatic, real-time video analysis remains a challenge, different computer vision approaches are discussed and compared. Audio, video and text feature extraction methods are examined, and future trends for improving the reviewed systems are outlined in terms of response time optimization, higher precision and eliminating the need for human intervention in video analysis.

Author information

Corresponding author

Correspondence to Payam Oskouie.

About this article

Cite this article

Oskouie, P., Alipour, S. & Eftekhari-Moghadam, AM. Multimodal feature extraction and fusion for semantic mining of soccer video: a survey. Artif Intell Rev 42, 173–210 (2014). https://doi.org/10.1007/s10462-012-9332-4

  • DOI: https://doi.org/10.1007/s10462-012-9332-4
