Deep learning based basketball video analysis for intelligent arena application

Multimedia Tools and Applications

Abstract

Given the tremendous growth in the number of sports fans, the “Intelligent Arena”, which can greatly improve the fun of traditional sports, has become an emerging application and research topic. Advances in multimedia computing and artificial intelligence allow intelligent sports video analysis to add live video broadcast, score detection, highlight video generation, and online sharing functions to intelligent arena applications. In this paper, we propose a deep learning based video analysis scheme for intelligent basketball arena applications. First, with multiple cameras or mobile devices capturing the activities in the arena, the proposed scheme automatically selects the camera that gives a high-quality broadcast in real time. Furthermore, with a basketball energy image based deep convolutional neural network, we detect scoring clips and use them as highlight video reels to support replay of exciting plays and online sharing. Finally, evaluations on a real-world basketball match dataset that we built demonstrate that the proposed system obtains 94.59% accuracy with less than 45 ms of processing time per frame (10 ms for broadcast camera selection and 35 ms for scoring detection). Owing to this performance, the proposed deep learning based basketball video analysis scheme has been implemented in a commercial intelligent basketball arena application named “Standz Basketball”. Although the application had been released for only one month, it reached 85th place in the daily download ranking of the sports category of the Chinese iTunes market.
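The per-frame timing quoted in the abstract can be read as a simple latency budget. The sketch below is only a worked piece of arithmetic, not part of the authors' system: it assumes the two stages run one after the other on every frame, and the 20 and 30 frames/s stream rates in the usage lines are hypothetical values chosen for illustration. Because the abstract reports "less than 45 ms", the throughput figure computed here is a lower bound on the sequential rate.

```python
# Per-frame latencies reported in the abstract, in milliseconds (upper bounds).
CAMERA_SELECTION_MS = 10.0
SCORING_DETECTION_MS = 35.0


def per_frame_budget(stage_ms):
    """Return (total latency in ms, sequential throughput in frames/s).

    Assumption: stages run back to back on each frame with no overlap.
    """
    total_ms = sum(stage_ms)
    frames_per_second = 1000.0 / total_ms
    return total_ms, frames_per_second


def keeps_up(stage_ms, stream_fps):
    """True if one frame is processed within one frame interval of the stream."""
    total_ms, _ = per_frame_budget(stage_ms)
    return total_ms <= 1000.0 / stream_fps


stages = [CAMERA_SELECTION_MS, SCORING_DETECTION_MS]
total_ms, max_fps = per_frame_budget(stages)
print(f"total: {total_ms:.0f} ms per frame, throughput: {max_fps:.1f} frames/s")

# Hypothetical stream rates, not taken from the paper.
print(keeps_up(stages, 20))  # 50.0 ms frame interval
print(keeps_up(stages, 30))  # about 33.3 ms frame interval
```

Under these assumptions the budget corresponds to roughly 22 frames per second when every frame passes through both stages; a faster stream would need frame skipping or overlapping the two stages.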

Notes

  1. “Intelligent Arena,” https://www.huiti.com

  2. “Beikantai,” https://www.kantai.tv

  3. “WSC Sports,” https://www.wscouting.com

  4. “Catapult Sports,” https://www.catapultsports.com

  5. “Standz Basketball,” http://standz.com

  6. “LibSVM toolkit,” http://www.csie.ntu.edu.tw/~cjlin/libsvm/

  7. Dollár P, “Piotr’s Computer Vision Matlab Toolbox (PMT),” https://github.com/pdollar/toolbox

  8. “AppAnnie,” https://www.appannie.com/

Acknowledgments

The copyright of the “Standz Basketball” application belongs to Zepp Technology Inc. The authors would like to thank Zepp Technology Inc. for its fundamental support of our research. The authors would also like to thank Xiaowei Dai, Hao Su, Kun Liu, Xinchen Liu, Xiaoyan Gu, Zeyu Liu, Jacky Shen, and Jiahui Du for their participation in developing the application and for their valuable comments during the preparation of this paper.

Author information

Corresponding author

Correspondence to Chenggang Clarence Yan.

Additional information

This work was supported by the National Key Research and Development Plan (No. 2016YFC0801005), the Funds for Creative Research Groups of China (No. 61421061), the National Natural Science Foundation of China (No. 61602049), the Beijing Training Project for the Leading Talents in S&T (No. ljrc 201502), and the CCF-Tencent Open Research Fund (No. AGR20160113).

About this article

Cite this article

Liu, W., Yan, C.C., Liu, J. et al. Deep learning based basketball video analysis for intelligent arena application. Multimed Tools Appl 76, 24983–25001 (2017). https://doi.org/10.1007/s11042-017-5002-5

  • DOI: https://doi.org/10.1007/s11042-017-5002-5
