Improved CNN Model Stability and Robustness with Video Frame Segmentation

Stefański, Piotr; Jach, Tomasz

doi:10.1007/978-3-031-70816-9_13

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14810)

Included in the following conference series:

International Conference on Computational Collective Intelligence

Abstract

This paper investigates how image segmentation affects the stability and robustness of convolutional neural network (CNN) models, particularly those that classify frames containing tiny informative objects. First, the authors introduce a new frame segmentation algorithm that preprocesses video frames before they are classified. Second, they propose the average absolute difference between training accuracy and validation accuracy as a metric of the reliability and consistency of CNN models during training. The results demonstrate the efficacy of the proposed techniques in improving image classification, particularly where the crucial objects occupy only a small part of the frame, and suggest that the approach can be applied to similar problems in other fields. The research addresses specific challenges in sports footage analysis and also has broader implications for similar image classification tasks across domains, laying a foundation for future work on improving CNN model performance.
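The stability metric named in the abstract can be illustrated with a minimal sketch. It assumes the difference is taken per epoch and then averaged over all epochs, which the abstract does not state explicitly. The function name `mean_abs_accuracy_gap` and the accuracy values are hypothetical and serve only as an illustration; they are not results from the paper.

```python
from typing import Sequence


def mean_abs_accuracy_gap(train_acc: Sequence[float],
                          val_acc: Sequence[float]) -> float:
    """Average of |train_acc[e] - val_acc[e]| over epochs e.

    Assumption: one accuracy value per epoch for each curve.
    A smaller value means the two curves stay closer together.
    """
    if len(train_acc) != len(val_acc):
        raise ValueError("both curves must cover the same number of epochs")
    if len(train_acc) == 0:
        raise ValueError("at least one epoch is required")
    total = sum(abs(t - v) for t, v in zip(train_acc, val_acc))
    return total / len(train_acc)


# Hypothetical per-epoch accuracies, made up for illustration only.
train = [0.60, 0.75, 0.85, 0.90]
val = [0.55, 0.65, 0.80, 0.70]

gap = mean_abs_accuracy_gap(train, val)
print(f"mean absolute accuracy gap: {gap:.3f}")
```

Under this reading, a model whose validation accuracy tracks its training accuracy closely yields a value near zero, while large or erratic gaps between the two curves push the value up.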

Author information

Authors and Affiliations

Department of Machine Learning, University of Economics in Katowice, 1 Maja, 40-287, Katowice, Poland
Piotr Stefański & Tomasz Jach

Authors

Piotr Stefański
Tomasz Jach

Corresponding authors

Correspondence to Piotr Stefański or Tomasz Jach.

Editor information

Editors and Affiliations

Wrocław University of Science and Technology, Wrocław, Poland
Ngoc Thanh Nguyen
University of Leipzig, Leipzig, Germany
Bogdan Franczyk
University of Leipzig, Leipzig, Sachsen, Germany
André Ludwig
Universidad Complutense de Madrid, Madrid, Spain
Manuel Núñez
Vrije Universiteit Amsterdam, Amsterdam, Noord-Holland, The Netherlands
Jan Treur
University of Münster, Münster, Germany
Gottfried Vossen
Wrocław University of Science and Technology, Wrocław, Poland
Adrianna Kozierkiewicz

About this paper

Cite this paper

Stefański, P., Jach, T. (2024). Improved CNN Model Stability and Robustness with Video Frame Segmentation. In: Nguyen, N.T., et al. Computational Collective Intelligence. ICCCI 2024. Lecture Notes in Computer Science, vol 14810. Springer, Cham. https://doi.org/10.1007/978-3-031-70816-9_13

DOI: https://doi.org/10.1007/978-3-031-70816-9_13
Published: 28 August 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70815-2
Online ISBN: 978-3-031-70816-9
eBook Packages: Computer Science, Computer Science (R0)
