Abstract
This paper investigates how image segmentation affects the stability and robustness of convolutional neural network (CNN) models, particularly models that classify frames in which the informative objects are tiny. First, the authors introduce a new frame segmentation algorithm that preprocesses video frames before classification. Second, they propose the average absolute difference between training accuracy and validation accuracy as a metric of the reliability and consistency of CNN models during training. The study demonstrates the efficacy of the proposed techniques in improving image classification outcomes, particularly where the crucial objects occupy only a small part of the frame, which makes the approach applicable to similar problems in other areas of science. The research addresses specific challenges in sports footage analysis and also has broader implications for similar image classification tasks across various domains, laying a foundation for future work on improving CNN model performance.
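The stability metric named in the abstract can be illustrated with a minimal sketch. It assumes the metric is the mean, over training epochs, of the absolute gap between training and validation accuracy; the abstract does not give the exact formulation, so the per-epoch averaging, the function name `mean_abs_accuracy_gap`, and the sample values below are assumptions made for illustration only.

```python
from typing import Sequence


def mean_abs_accuracy_gap(train_acc: Sequence[float],
                          val_acc: Sequence[float]) -> float:
    """Average absolute difference between training and validation accuracy.

    Assumption: both sequences hold one accuracy value per epoch, in the
    same order. A smaller result means the two curves stay closer together.
    """
    if len(train_acc) != len(val_acc):
        raise ValueError("train_acc and val_acc must have the same length")
    if len(train_acc) == 0:
        raise ValueError("at least one epoch is required")
    # Sum the per-epoch absolute gaps, then divide by the number of epochs.
    total_gap = sum(abs(t - v) for t, v in zip(train_acc, val_acc))
    return total_gap / len(train_acc)


# Hypothetical accuracy histories for four epochs (not data from the paper).
example_train = [0.60, 0.75, 0.85, 0.90]
example_val = [0.55, 0.70, 0.75, 0.80]
example_gap = mean_abs_accuracy_gap(example_train, example_val)
```

Under this reading, a model whose validation accuracy tracks its training accuracy closely throughout training gets a value near zero, while a model that overfits or oscillates gets a larger one.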
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Stefański, P., Jach, T. (2024). Improved CNN Model Stability and Robustness with Video Frame Segmentation. In: Nguyen, N.T., et al. Computational Collective Intelligence. ICCCI 2024. Lecture Notes in Computer Science, vol 14810. Springer, Cham. https://doi.org/10.1007/978-3-031-70816-9_13
DOI: https://doi.org/10.1007/978-3-031-70816-9_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70815-2
Online ISBN: 978-3-031-70816-9
eBook Packages: Computer Science (R0)