Downsampling GAN for Small Object Data Augmentation

Conference paper in Computer Analysis of Images and Patterns (CAIP 2023)

Abstract

The limited visual information provided by small objects (under 32 \(\times\) 32 pixels) makes small object detection a particularly challenging problem for current detectors. Moreover, standard datasets are biased towards large objects, limiting the variability of the training set for the small object subset. Although new datasets specifically designed for small object detection have recently been released, detection precision remains significantly lower than that of standard object detection. We propose a data augmentation method based on a Generative Adversarial Network (GAN) to increase the availability of small object samples at training time, boosting the performance of standard object detectors on this highly demanding subset. Our Downsampling GAN (DS-GAN) generates new small objects from larger ones, avoiding the unrealistic artifacts created by traditional resizing methods. The synthetically generated objects are inserted into the original dataset images at plausible positions without causing mismatches between foreground and background. The proposed method improves the AP\(^{@[.5,.95]}_{s}\) and AP\(^{@.5}_{s}\) of a standard object detector on the UAVDT small subset by more than 4 and 10 points, respectively.
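The chapter preview does not include code, but the core idea, a generator that learns to turn large object crops into realistic small ones instead of naively resizing them, can be illustrated with a minimal adversarial training sketch. The PyTorch code below is an assumed, simplified illustration only: the crop sizes (64x64 to 16x16), layer widths, BCE adversarial loss, and optimizer settings are placeholders chosen for brevity, not the authors' DS-GAN architecture.

```python
# Minimal sketch of a "downsampling GAN": a generator maps large object crops
# (64x64, assumed) to small crops (16x16, assumed) and is trained adversarially
# against crops of real small objects, so its outputs look like genuine
# low-resolution objects rather than naively resized ones.
import torch
import torch.nn as nn


class DownsamplingGenerator(nn.Module):
    """Maps a 64x64 object crop to a 16x16 crop (assumed 4x reduction)."""
    def __init__(self, channels=3, width=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, width, 3, stride=2, padding=1),   # 64 -> 32
            nn.ReLU(inplace=True),
            nn.Conv2d(width, width, 3, stride=2, padding=1),      # 32 -> 16
            nn.ReLU(inplace=True),
            nn.Conv2d(width, channels, 3, padding=1),
            nn.Tanh(),                                            # outputs in [-1, 1]
        )

    def forward(self, x):
        return self.net(x)


class SmallObjectDiscriminator(nn.Module):
    """Scores 16x16 crops: real small objects vs. generated ones."""
    def __init__(self, channels=3, width=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, width, 3, stride=2, padding=1),   # 16 -> 8
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(width, width * 2, 3, stride=2, padding=1),  # 8 -> 4
            nn.LeakyReLU(0.2, inplace=True),
            nn.Flatten(),
            nn.Linear(width * 2 * 4 * 4, 1),                      # single real/fake logit
        )

    def forward(self, x):
        return self.net(x)


def train_step(gen, disc, g_opt, d_opt, large_crops, real_small_crops):
    """One adversarial update with a plain BCE GAN loss (an assumption for brevity)."""
    bce = nn.BCEWithLogitsLoss()

    # Discriminator: real small crops -> 1, generated crops -> 0.
    fake = gen(large_crops).detach()
    d_loss = (bce(disc(real_small_crops), torch.ones(real_small_crops.size(0), 1))
              + bce(disc(fake), torch.zeros(fake.size(0), 1)))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator: fool the discriminator into scoring its crops as real.
    fake = gen(large_crops)
    g_loss = bce(disc(fake), torch.ones(fake.size(0), 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return d_loss.item(), g_loss.item()


# Toy usage with random tensors standing in for object crops.
gen, disc = DownsamplingGenerator(), SmallObjectDiscriminator()
g_opt = torch.optim.Adam(gen.parameters(), lr=2e-4, betas=(0.5, 0.999))
d_opt = torch.optim.Adam(disc.parameters(), lr=2e-4, betas=(0.5, 0.999))
large = torch.randn(8, 3, 64, 64)   # crops of large objects
small = torch.randn(8, 3, 16, 16)   # crops of real small objects
d_loss, g_loss = train_step(gen, disc, g_opt, d_opt, large, small)
```

In the full pipeline described in the abstract, the generated crops would then be blended into dataset images at plausible positions, and the corresponding bounding boxes added to the annotations, before training the detector on the augmented set.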

Acknowledgements

This research was partially funded by the Spanish Ministerio de Ciencia e Innovación (grant number PID2020-112623GB-I00) and the Galician Consellería de Cultura, Educación e Universidade (grant numbers ED431C 2018/29, ED431C 2021/048, ED431G 2019/04). These grants are co-funded by the European Regional Development Fund (ERDF). This work was also supported by the European Union's Horizon 2020 research and innovation programme under grant number 951911 (AI4Media).

Author information

Correspondence to Daniel Cores.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Cores, D., Brea, V.M., Mucientes, M., Seidenari, L., Del Bimbo, A. (2023). Downsampling GAN for Small Object Data Augmentation. In: Tsapatsoulis, N., et al. Computer Analysis of Images and Patterns. CAIP 2023. Lecture Notes in Computer Science, vol 14184. Springer, Cham. https://doi.org/10.1007/978-3-031-44237-7_9

  • DOI: https://doi.org/10.1007/978-3-031-44237-7_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-44236-0

  • Online ISBN: 978-3-031-44237-7

  • eBook Packages: Computer Science (R0)
