Improved CNN Model Stability and Robustness with Video Frame Segmentation

  • Conference paper
Computational Collective Intelligence (ICCCI 2024)

Abstract

This paper investigates how image segmentation affects the stability and robustness of convolutional neural network (CNN) models, particularly models that classify frames in which the informative objects are tiny. First, the authors introduce a new frame segmentation algorithm that preprocesses video frames before classification. Second, they propose the average absolute difference between training accuracy and validation accuracy as a metric of the reliability and consistency of CNN models during training. The study demonstrates the efficacy of the proposed techniques in improving image classification outcomes, particularly when the crucial objects occupy only a small part of the frame, which makes the approach applicable to similar problems in other areas of science. The research addresses specific challenges in sports footage analysis and also has broader implications for similar image classification tasks in other domains, laying a foundation for future work on improving CNN model performance.
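The stability metric named in the abstract can be illustrated with a minimal sketch. It assumes that accuracy is recorded once per epoch for both the training and the validation set and that the metric is the plain mean of the per-epoch absolute gaps; the abstract does not state the exact formulation, so the function name `mean_abs_accuracy_gap` and the sample values below are hypothetical and not taken from the paper.

```python
from typing import Sequence


def mean_abs_accuracy_gap(train_acc: Sequence[float],
                          val_acc: Sequence[float]) -> float:
    """Average absolute difference between per-epoch training and
    validation accuracy (hypothetical helper, not the authors' code).

    A smaller value means the two curves stay closer together,
    which is read here as more stable training.
    """
    if len(train_acc) != len(val_acc):
        raise ValueError("both histories must cover the same epochs")
    if len(train_acc) == 0:
        raise ValueError("at least one epoch is required")
    # Sum the absolute gap for each epoch, then divide by the epoch count.
    total_gap = sum(abs(t - v) for t, v in zip(train_acc, val_acc))
    return total_gap / len(train_acc)


# Illustrative values only: three epochs of made-up accuracies.
example_gap = mean_abs_accuracy_gap([0.5, 0.75, 1.0], [0.5, 0.5, 0.5])
print(example_gap)
```

Under these assumptions, comparing the value for a model trained on raw frames with the value for the same model trained on segmented frames would show which training run keeps the two accuracy curves closer together.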

Author information

Corresponding authors

Correspondence to Piotr Stefański or Tomasz Jach.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Stefański, P., Jach, T. (2024). Improved CNN Model Stability and Robustness with Video Frame Segmentation. In: Nguyen, N.T., et al. Computational Collective Intelligence. ICCCI 2024. Lecture Notes in Computer Science, vol 14810. Springer, Cham. https://doi.org/10.1007/978-3-031-70816-9_13

  • DOI: https://doi.org/10.1007/978-3-031-70816-9_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-70815-2

  • Online ISBN: 978-3-031-70816-9

  • eBook Packages: Computer Science, Computer Science (R0)
