Grid self-occlusion: a grid self-occlusion data augmentation for better classification

  • Original Paper
  • Published:
Signal, Image and Video Processing

Abstract

Data augmentation is a widely used regularization strategy that can effectively alleviate over-fitting and improve the robustness of DCNNs under occlusion. Although data augmentation methods based on information erasing have made significant progress, an imbalance remains between the deletion and retention of information. In this paper, we propose a novel data augmentation method named grid self-occlusion, which exploits a binary mask with inconsistent intervals to balance the erased and preserved information. Specifically, we first locate the central area of a sample according to the characteristics of its instance distribution: a saliency map is used to obtain the central region for higher-resolution images, while a central prior is exploited for lower-resolution images. Then, patches copied from the central area are used to construct the binary mask. Lastly, this strategy is applied randomly during the training stage. Extensive experiments are performed on challenging datasets. For classification, accuracy is improved by 1.32% on the CIFAR-10 dataset. For detection, mAP is increased by 3.03% and 4.19% for FCOS and CenterNet on the Pascal VOC dataset, respectively. Experimental results demonstrate the effectiveness of the proposed method.
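As a rough, illustrative sketch of the idea described above (not the authors' implementation, which the abstract does not specify in detail), the Python/NumPy code below builds a grid-like binary mask with irregular ("inconsistent") intervals inside the sample's central region and applies it with some probability during training. A simple central prior stands in for the saliency map used on higher-resolution images, and the function names (central_region, grid_self_occlusion), the cell and interval size ranges, and the default probability p = 0.5 are hypothetical choices made only for this example.

import numpy as np


def central_region(h, w, ratio=0.6):
    """Central prior: bounds of a centered box covering `ratio` of each side.
    (Hypothetical stand-in; the paper uses a saliency map to locate the
    central region of higher-resolution images.)"""
    ch, cw = int(h * ratio), int(w * ratio)
    top, left = (h - ch) // 2, (w - cw) // 2
    return top, left, top + ch, left + cw


def grid_self_occlusion(img, p=0.5, rng=None):
    """Erase grid-like cells, placed at irregular intervals, inside the
    central region of an H x W x C image.  Returns a masked copy of `img`."""
    rng = rng if rng is not None else np.random.default_rng()
    if rng.random() > p:                     # applied randomly during training
        return img

    h, w = img.shape[:2]
    top, left, bottom, right = central_region(h, w)

    mask = np.ones((h, w), dtype=img.dtype)  # 1 = keep, 0 = erase
    y = top
    while y < bottom:
        cell_h = int(rng.integers(max(h // 16, 2), max(h // 8, 3) + 1))  # hypothetical cell height
        gap_h = int(rng.integers(max(h // 16, 2), max(h // 6, 3) + 1))   # "inconsistent" row interval
        x = left
        while x < right:
            cell_w = int(rng.integers(max(w // 16, 2), max(w // 8, 3) + 1))
            gap_w = int(rng.integers(max(w // 16, 2), max(w // 6, 3) + 1))
            mask[y:min(y + cell_h, bottom), x:min(x + cell_w, right)] = 0
            x += cell_w + gap_w
        y += cell_h + gap_h

    return img * mask[..., None]             # zero out the occluded cells

In a training pipeline, such a transform would typically be applied per sample after the usual crop/flip augmentations, e.g. batch = np.stack([grid_self_occlusion(x) for x in batch]).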



Author information

Correspondence to Hao Zhao.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Deng, X., Zhao, H., Zhang, H. et al. Grid self-occlusion: a grid self-occlusion data augmentation for better classification. SIViP 17, 705–713 (2023). https://doi.org/10.1007/s11760-022-02278-0

