HMM based soccer video event detection using enhanced mid-level semantic

Multimedia Tools and Applications

Abstract

Highlight detection is a fundamental step in semantics-based video retrieval and personalized sports video browsing. This paper proposes a hidden Markov model (HMM) based soccer video event detection method built on a hierarchical video analysis framework. Soccer video shots are first classified into four coarse mid-level semantics: global, median, close-up and audience. Global and local motion information is then used to refine these coarse mid-level semantics. The video sequence is segmented into event clips, and both the temporal transitions of the mid-level semantics and the overall features of each event clip are fused using HMMs to determine the event type. The highlight detection performance of dynamic Bayesian networks (DBN), conditional random fields (CRF) and the proposed HMM-based approach is compared. The average F-score of the proposed approach over four highlight types (goal, shoot, foul and placed kick) is 82.92%, which exceeds that of DBN and CRF by 9.85% and 11.12% respectively. The effects of the number of hidden states, the overall features, and the refinement of mid-level semantics on event detection performance are also discussed.
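
To make the classification idea concrete, the following is a minimal sketch of one common way to use HMMs for event typing: train one HMM per event type, score an event clip's sequence of mid-level semantic labels under each model with the forward algorithm, and pick the most likely model. This is an illustration only, not the authors' implementation. The two-state models, all probability values, and the names `forward_log_likelihood`, `classify` and `MODELS` are hypothetical, and the sketch uses only the temporal transitions of shot labels; the overall clip features and the motion-based refinement described in the abstract are left out.

```python
import math

# The four coarse mid-level semantics named in the abstract.
SHOTS = ["global", "median", "close-up", "audience"]
SHOT_INDEX = {name: i for i, name in enumerate(SHOTS)}


def forward_log_likelihood(labels, start, trans, emit):
    """Return log P(labels | HMM) using the scaled forward algorithm."""
    obs = [SHOT_INDEX[name] for name in labels]
    n = len(start)
    # Initialisation: start probability times emission of the first label.
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Induction: propagate through transitions, then emit label t.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        # Rescale to avoid underflow and accumulate the log of the scale.
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


# Hypothetical toy parameters (NOT taken from the paper): two hidden states
# per event type, each emission row ordered as in SHOTS.
MODELS = {
    "goal": {
        "start": [0.8, 0.2],
        "trans": [[0.6, 0.4], [0.2, 0.8]],
        "emit": [[0.7, 0.2, 0.05, 0.05], [0.05, 0.15, 0.5, 0.3]],
    },
    "foul": {
        "start": [0.7, 0.3],
        "trans": [[0.7, 0.3], [0.4, 0.6]],
        "emit": [[0.5, 0.4, 0.08, 0.02], [0.1, 0.5, 0.38, 0.02]],
    },
}


def classify(labels):
    """Pick the event type whose HMM gives the highest log-likelihood."""
    return max(
        MODELS,
        key=lambda name: forward_log_likelihood(labels, **MODELS[name]),
    )


if __name__ == "__main__":
    clip = ["global", "close-up", "audience", "close-up"]
    print(classify(clip))
```

In this toy setup, a clip that moves from a global view to close-up and audience shots scores higher under the hypothetical goal model, because that model assigns more probability to audience shots. In practice the parameters would be estimated from labelled event clips rather than set by hand.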



Acknowledgement

This work is supported by the National Natural Science Foundation of China No.60903121, Chinese Center University Foundation XJTU-HRT-002, and Microsoft Research Foundation FY11-RES-THEME-052. The authors give special thanks to Wenjun Zeng of the Computer Science Department at the University of Missouri for proofreading the paper and for discussion.

Author information

Corresponding authors

Correspondence to Xueming Qian or Xingsong Hou.

About this article

Cite this article

Qian, X., Wang, H., Liu, G. et al. HMM based soccer video event detection using enhanced mid-level semantic. Multimed Tools Appl 60, 233–255 (2012). https://doi.org/10.1007/s11042-011-0817-y

  • DOI: https://doi.org/10.1007/s11042-011-0817-y
