Abstract
Data augmentation is a widely used regularization strategy that can effectively alleviate overfitting and improve the robustness of deep convolutional neural networks (DCNNs) under occlusion. Although data augmentation methods based on information erasing have made significant progress, they still suffer from an imbalance between the deletion and the retention of information. In this paper, we propose a novel data augmentation method named grid self-occlusion, which uses a binary mask with inconsistent intervals to balance how much information is erased. Specifically, we first obtain the central area of a sample from the characteristics of the instance distribution: a saliency map is used to locate the central region in higher-resolution images, and a central prior is used to locate it in lower-resolution images. Then, copy patches selected from the central area are used to construct the binary mask. Lastly, this strategy is applied randomly during the training stage. Extensive experiments are performed on challenging datasets. For classification, accuracy improves by 1.32% on the CIFAR-10 dataset. For detection, the mAP increases by 3.03% and 4.19% for FCOS and CenterNet, respectively, on the Pascal VOC dataset. These results demonstrate the effectiveness of the proposed method.
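The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify patch sizes, gap ranges, or how copy patches are chosen, so every function name and parameter below (`central_box`, `irregular_grid_corners`, `grid_self_occlusion`, `patch`, `min_gap`, `max_gap`, `prob`, `fraction`) is a hypothetical choice made for illustration. Only the central-prior branch for lower-resolution images is sketched; the saliency-map branch is omitted, and "inconsistent intervals" is read here as randomly varying gaps between occluded squares.

```python
import numpy as np


def central_box(height, width, fraction=0.5):
    """Center-prior box (assumption): the middle `fraction` of each dimension."""
    box_h = max(1, int(height * fraction))
    box_w = max(1, int(width * fraction))
    top, left = (height - box_h) // 2, (width - box_w) // 2
    return top, left, top + box_h, left + box_w


def irregular_grid_corners(height, width, patch, min_gap, max_gap, rng):
    """Top-left corners of patch-sized squares separated by random gaps."""
    corners = []
    y = int(rng.integers(0, max_gap + 1))
    while y + patch <= height:
        x = int(rng.integers(0, max_gap + 1))
        while x + patch <= width:
            corners.append((y, x))
            x += patch + int(rng.integers(min_gap, max_gap + 1))
        y += patch + int(rng.integers(min_gap, max_gap + 1))
    return corners


def grid_self_occlusion(image, patch=4, min_gap=4, max_gap=8, prob=0.5, rng=None):
    """Return (augmented image, binary mask); mask is 1 = kept, 0 = occluded."""
    rng = np.random.default_rng() if rng is None else rng
    height, width = image.shape[:2]
    mask = np.ones((height, width), dtype=np.uint8)
    out = image.copy()

    # Apply the augmentation only with probability `prob`.
    if rng.random() >= prob:
        return out, mask

    top, left, bottom, right = central_box(height, width)
    if bottom - top < patch or right - left < patch:
        return out, mask  # central area too small to hold a copy patch

    # Pick one copy patch from inside the central area of the same image.
    sy = int(rng.integers(top, bottom - patch + 1))
    sx = int(rng.integers(left, right - patch + 1))
    source = image[sy:sy + patch, sx:sx + patch].copy()

    # Occlude each grid square with the copy patch and record it in the mask.
    for y, x in irregular_grid_corners(height, width, patch, min_gap, max_gap, rng):
        mask[y:y + patch, x:x + patch] = 0
        out[y:y + patch, x:x + patch] = source
    return out, mask


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)  # CIFAR-sized dummy
    augmented, mask = grid_self_occlusion(image, prob=1.0, rng=rng)
    print("occluded fraction:", 1.0 - mask.mean())
```

In a training loop, a function like this would be called per sample before batching; the occluded fraction can be tuned through the hypothetical `patch`, `min_gap`, and `max_gap` parameters, which is one way to explore the balance between erased and retained information that the abstract refers to.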
Cite this article
Deng, X., Zhao, H., Zhang, H. et al. Grid self-occlusion: a grid self-occlusion data augmentation for better classification. SIViP 17, 705–713 (2023). https://doi.org/10.1007/s11760-022-02278-0
DOI: https://doi.org/10.1007/s11760-022-02278-0