Skip to main content
Log in

Content modification of soccer videos using a supervised deep learning framework

  • Published:
Multimedia Tools and Applications Aims and scope Submit manuscript

Abstract

In this paper, we initially propose a novel framework for replacing advertisement contents in soccer videos in an automatic way by using deep learning strategies. For this purpose, we begin by applying UNET (an image segmentation convolutional neural network technique) for content segmentation and detection. Subsequently, after reconstructing the segmented content in the video frames (considering the apparent loss in detection), we will replace the unwanted content by new one using a homography mapping procedure. Furthermore, the replacement key points will be tracked into the next frames considering the zoom-in and zoom-out controlling using multiplication of the key point coordinates by the homography matrix between each two consecutive frames. Since the movement of objects in video can disrupt the alignment between frames and correspondingly make the homography matrix calculation erroneous, we use Mask R-CNN algorithm to mask and remove the moving objects from the scene. Accordingly, the replacement will be consistent to the video motion of scene. Such framework is denominated as REP-Model which stands for a replacing model. In addition, we have examined the REP-Model over a large database regarding soccer match videos for removing and replacing the playground billboard contents and the results reveal the discriminative nature of our proposed framework. Furthermore, in order to key out the covered object beneath the new content, we use an unsupervised approach in an adversarial learning set-up by learning object masks with playing a game of cut-and-paste, using a discriminator model to find out whether the covered object has been revealed correctly.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12

Similar content being viewed by others

References

  1. Aldershoff F, Gevers T (2003) Visual tracking and localization of billboards in streamed soccer matches. In: Storage and retrieval methods and applications for multimedia 2004, vol 5307. International Society for Optics and Photonics, pp 408–416

  2. Algarni A D (2020) Efficient object detection and classification of heat emitting objects from infrared images based on deep learning. Multimed Tools Appl 79:1–24

    Article  Google Scholar 

  3. Bengani S, Vadivel S et al (2020) Automatic segmentation of optic disc in retinal fundus images using semi-supervised deep learning. Multimed Tools Appl 80:1–26

    Google Scholar 

  4. Brock A, Donahue J, Simonyan K (2018) Large scale gan training for high fidelity natural image synthesis. arXiv:1809.11096

  5. Brown M, Lowe D G (2007) Automatic panoramic image stitching using invariant features. Int J Comput Vis 74(1):59–73

    Article  Google Scholar 

  6. Burgess C P, Matthey L, Watters N, Kabra R, Higgins I, Botvinick M, Lerchner A (2019) Monet: unsupervised scene decomposition and representation. arXiv:1901.11390

  7. Caelles S, Maninis K K, Pont-Tuset J, Leal-Taixé L, Cremers D, Van Gool L (2017) One-shot video object segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 221–230

  8. Cai G, Chen L, Li J (2003) Billboard advertising detection in sport tv. In: Seventh international symposium on signal processing and its applications, 2003. Proceedings, vol 1. IEEE, pp 537–540

  9. Cao X, Gao S, Chen L, Wang Y (2020) Ship recognition method combined with image segmentation and deep learning feature extraction in video surveillance. Multimed Tools Appl 79(13):9177–9192

    Article  Google Scholar 

  10. Chen M, Artières T, Denoyer L (2019) Unsupervised object segmentation by redrawing. In: Advances in neural information processing systems, pp 12705–12716

  11. Cheng J, Tsai Y H, Hung W C, Wang S, Yang M H (2018) Fast and accurate online video object segmentation via tracking parts. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 7415–7424

  12. Chum O, Matas J (2008) Optimal randomized ransac. IEEE Trans Pattern Anal Mach Intell 30(8):1472–1482

    Article  Google Scholar 

  13. Dean J, Corrado G, Monga R, Chen K, Devin M, Mao M, Ranzato M, Senior A, Tucker P, Yang K et al (2012) Large scale distributed deep networks. In: Advances in neural information processing systems, pp 1223–1231

  14. Deng J, Dong W, Socher R, Li L J, Li K, Fei-Fei L (2009) Imagenet: a large-scale hierarchical image database. In: 2009 IEEE conference on computer vision and pattern recognition. IEEE, pp 248–255

  15. Dewi C, Chen R C, Yu H (2020) Weight analysis for various prohibitory sign detection and recognition using deep learning. Multimed Tools Appl 79:1–19

    Article  Google Scholar 

  16. Egilmez H E, Chao Y H, Ortega A (2020) Graph-based transforms for video coding. IEEE Trans Image Process 29:9330–9344

    Article  MathSciNet  Google Scholar 

  17. Eslami S A, Heess N, Weber T, Tassa Y, Szepesvari D, Hinton G E et al (2016) Attend, infer, repeat: fast scene understanding with generative models. In: Advances in neural information processing systems, pp 3225–3233

  18. Feng Z, Neumann J (2013) Real time commercial detection in videos

  19. Gao Z, Zhang H, Dong S, Sun S, Wang X, Yang G, Wu W, Li S, de Albuquerque V H C (2020) Salient object detection in the distributed cloud-edge intelligent network. IEEE Netw 34(2):216–224

    Article  Google Scholar 

  20. Girshick R (2015) Fast r-cnn. In: Proceedings of the IEEE international conference on computer vision, pp 1440–1448

  21. Girshick R, Donahue J, Darrell T, Malik J (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 580–587

  22. Gregor K, Danihelka I, Graves A, Rezende D J, Wierstra D (2015)

  23. Gruosso M, Capece N, Erra U (2020) Human segmentation in surveillance video with deep learning. Multimed Tools Appl 80:1–25

    Google Scholar 

  24. Guo J, Bai H, Tang Z, Xu P, Gan D, Liu B Multi modal human action recognition for video content matching

  25. He K, Zhang X, Ren S, Sun J (2015) Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans Pattern Anal Mach Intell 37(9):1904–1916

    Article  Google Scholar 

  26. He K, Gkioxari G, Dollár P, Girshick R (2017) Mask r-cnn. In: Proceedings of the IEEE international conference on computer vision, pp 2961–2969

  27. Hosang J, Benenson R, Dollár P, Schiele B (2015) What makes for effective detection proposals? IEEE Trans Pattern Anal Mach Intell 38(4):814–830

    Article  Google Scholar 

  28. Hossari M, Dev S, Nicholson M, McCabe K, Nautiyal A, Conran C, Tang J, Xu W, Pitié F (2018) Adnet: a deep network for detecting adverts. arXiv:1811.04115

  29. Hou S, Zhou S, Liu W, Zheng Y (2018) Classifying advertising video by topicalizing high-level semantic concepts. Multimed Tools Appl 77 (19):25475–25511

    Article  Google Scholar 

  30. Hu Y T, Huang J B, Schwing A (2017) Maskrnn: instance level video object segmentation. In: Advances in neural information processing systems, pp 325–334

  31. Hu P, Wang G, Kong X, Kuen J, Tan Y P (2018) Motion-guided cascaded refinement network for video object segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1400–1409

  32. Hussain Z, Zhang M, Zhang X, Ye K, Thomas C, Agha Z, Ong N, Kovashka A (2017) Automatic understanding of image and video advertisements. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1705–1715

  33. Jang S W, Ahn B (2019) Effective detection of exposed target regions based on deep learning from multimedia data. Multimed Tools Appl 79:1–17

    Google Scholar 

  34. Ji X, Henriques J F, Vedaldi A (2018) Invariant information distillation for unsupervised image segmentation and clustering. arXiv:1807.06653

  35. Jindal N et al (2020) Copy move and splicing forgery detection using deep convolution neural network, and semantic segmentation. Multimed Tools Appl 80:1–29

    Google Scholar 

  36. Kanezaki A (2018) Unsupervised image segmentation by backpropagation. In: 2018 IEEE international conference on acoustics, speech and signal processing (ICASSP). IEEE, pp 1543–1547

  37. Khoreva A, Benenson R, Ilg E, Brox T, Schiele B (2017) Lucid data dreaming for object tracking. In: The DAVIS challenge on video object segmentation

  38. Kim Y, Jung S, Ji S, Hwang E, Rho S (2019) Iot-based personalized nie content recommendation system. Multimed Tools Appl 78(3):3009–3043. https://doi.org/10.1007/s11042-020-09603-0

    Article  Google Scholar 

  39. Kim D Y, Park J H, Lee Y, Kim S (2020) Network virtualization for real-time processing of object detection using deep learning. Multimed Tools Appl 1–19

  40. Kosub S (2019) A note on the triangle inequality for the jaccard distance. Pattern Recogn Lett 120:36–38

    Article  Google Scholar 

  41. Krizhevsky A, Sutskever I, Hinton G E (2012) Imagenet classification with deep convolutional neural networks. In: Advances in neural information processing systems, pp 1097–1105

  42. Lee H, Eum S, Kwon H (2019) Me r-cnn: MULti-expert r-cnn for object detection. IEEE Trans Image Process 29:1030–1044

    Article  MathSciNet  Google Scholar 

  43. Levandowsky M, Winter D (1971) Distance between sets. Nature 234(5323):34–35

    Article  Google Scholar 

  44. Li Y, Tang S, Zhang R, Zhang Y, Li J, Yan S (2019) Asymmetric gan for unpaired image-to-image translation. IEEE Trans Image Process 28(12):5881–5896

    Article  MathSciNet  Google Scholar 

  45. Lim J H, Ye J C (2017) Geometric gan. arXiv:1705.02894

  46. Lipkus A H (1999) A proof of the triangle inequality for the tanimoto distance. J Math Chem 26(1-3):263–265

    Article  Google Scholar 

  47. Liu J, Wang C, Su H, Du B, Tao D (2019) Multistage gan for fabric defect detection. IEEE Trans Image Process 29:3388–3400

    Article  Google Scholar 

  48. Long J, Shelhamer E, Darrell T (2015) Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3431–3440

  49. Lucas A, Lopez-Tapia S, Molina R, Katsaggelos A K (2019) Generative adversarial networks and perceptual losses for video super-resolution. IEEE Trans Image Process 28(7):3312–3327

    Article  MathSciNet  Google Scholar 

  50. Maninis K K, Caelles S, Chen Y, Pont-Tuset J, Leal-Taixé L, Cremers D, Van Gool L (2018) Video object segmentation without temporal information. IEEE Trans Pattern Anal Mach Intell 41(6):1515–1530

    Article  Google Scholar 

  51. Miyato T, Kataoka T, Koyama M, Yoshida Y (2018) Spectral normalization for generative adversarial networks. arXiv:1802.05957

  52. Moulton R, Jiang Y (2018) Maximally consistent sampling and the jaccard index of probability distributions. arXiv:1809.04052

  53. Ostyakov P, Suvorov R, Logacheva E, Khomenko O, Nikolenko S I (2018) Seigan: towards compositional image generation by simultaneously learning to segment, enhance, and inpaint. arXiv:1811.07630

  54. Perazzi F, Khoreva A, Benenson R, Schiele B, Sorkine-Hornung A (2017) Learning video object segmentation from static images. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2663–2672

  55. Pham T T, Do T T, Sünderhauf N, Reid I (2018) Scenecut: joint geometric and object segmentation for indoor scenes. In: 2018 IEEE international conference on robotics and automation (ICRA). IEEE, pp 1–9

  56. Remez T, Huang J, Brown M (2018) Learning to segment via cut-and-paste. In: Proceedings of the European conference on computer vision (ECCV), pp 37–52

  57. Ronneberger O, Fischer P, Brox T (2015) U-net: convolutional networks for biomedical image segmentation. In: International conference on medical image computing and computer-assisted intervention. Springer, pp 234–241

  58. Rublee E, Rabaud V, Konolige K, Bradski G (2011) Orb: an efficient alternative to sift or surf. In: 2011 International conference on computer vision. IEEE, pp 2564–2571

  59. Sakthivelan R, Rjendran P, Thangavel M (2020) A video analysis on user feedback based recommendation using a-fp hybrid algorithm. Multimed Tools Appl 79(5):3847–3859

    Article  Google Scholar 

  60. Sbai O, Couprie C, Aubry M (2018) Vector image generation by learning parametric layer decomposition. arXiv:1812.05484

  61. Silberman N, Hoiem D, Kohli P, Fergus R (2012) Indoor segmentation and support inference from rgbd images. In: European conference on computer vision. Springer, pp 746–760

  62. Tran D, Ranganath R, Blei D M (2017) Deep and hierarchical implicit models. arXiv:1702.08896, 7, 3

  63. Uijlings J R, Van De Sande K E, Gevers T, Smeulders A W (2013) Selective search for object recognition. Int J Comput Vis 104(2):154–171

  64. Voigtlaender P, Leibe B (2017) Online adaptation of convolutional neural networks for the 2017 Davis challenge on video object segmentation. In: The 2017 DAVIS challenge on video object segmentation-CVPR workshops, vol 5

  65. Voigtlaender P, Leibe B (2017) Online adaptation of convolutional neural networks for video object segmentation. arXiv:1706.09364

  66. Voigtlaender P, Chai Y, Schroff F, Adam H, Leibe B, Chen L C (2019) Feelvos: fast end-to-end embedding learning for video object segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 9481–9490

  67. Watve A, Sural S (2008) Soccer video processing for the detection of advertisement billboards. Pattern Recogn Lett 29(7):994–1006

    Article  Google Scholar 

  68. Wei W, Fan X, Song H, Wang H (2019) Video tamper detection based on multi-scale mutual information. Multimed Tools Appl 78(19):27109–27126

    Article  Google Scholar 

  69. Xia X, Kulis B (2017) W-net: a deep model for fully unsupervised image segmentation. arXiv:1711.08506

  70. Xiao Y, Tian Z, Yu J, Zhang Y, Liu S, Du S, Lan X (2020) A review of object detection based on deep learning. Multimed Tools Appl 79:1–63

    Article  Google Scholar 

  71. Yang J, Kannan A, Batra D, Parikh D (2017) Lr-gan: layered recursive generative adversarial networks for image generation. arXiv:1703.01560

  72. Yang L, Wang Y, Xiong X, Yang J, Katsaggelos A K (2018) Efficient video object segmentation via network modulation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 6499–6507

  73. Yong B, Wang C, Shen J, Li F, Yin H, Zhou R (2020) Automatic ventricular nuclear magnetic resonance image processing with deep learning. Multimed Tools Appl 1–17. https://doi.org/10.1007/s11042-020-08911-9

  74. Zhang Z (2000) A flexible new technique for camera calibration. IEEE Trans Pattern Anal Mach Intell 22(11):1330–1334

    Article  Google Scholar 

Download references

Acknowledgements

The completion of this research was made possible thanks to the Natural Sciences and Engineering research Council of Canada (NSERC). In addition, the authors would like to thank Edouard Geze, Adam Alcolado and Robert Graham for their assistance during the project.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Vahid Khorasani Ghassab.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Ghassab, V.K., Maanicshah, K., Green, P. et al. Content modification of soccer videos using a supervised deep learning framework. Multimed Tools Appl 81, 481–503 (2022). https://doi.org/10.1007/s11042-021-11383-0

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11042-021-11383-0

Keywords

Navigation