A deep data augmentation framework based on generative adversarial networks

  • 1221: Deep Learning for Image/Video Compression and Visual Quality Assessment
Multimedia Tools and Applications

Abstract

When training convolutional neural networks, the available training data is often insufficient to reach ideal performance, and the model overfits. To address this issue, traditional data augmentation (DA) techniques, which are designed manually based on empirical results, are often adopted in supervised learning. In essence, traditional DA techniques are an implicit form of feature engineering. Augmentation strategies must be designed carefully: for example, the distribution of the augmented samples should be close to the original data distribution, otherwise performance on the test set will degrade. Instead of designing augmentation strategies manually, we propose to learn the data distribution directly, so that new samples can be generated from the estimated distribution. Specifically, we propose a deep DA framework consisting of two neural networks: a generative adversarial network, which learns the data distribution, and a convolutional neural network classifier. We evaluate the proposed model on a handwritten Chinese character dataset and a digit dataset. The experimental results show that it outperforms the baseline methods, which include one well-designed manual DA method and two state-of-the-art DA methods.
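
The idea of estimating the data distribution and then sampling new training examples from it can be illustrated with a minimal sketch. This is not the framework from the paper: to stay dependency-free it swaps the generative adversarial network for a per-class, per-feature Gaussian estimate, and it omits the classifier entirely. The function names (`fit_class_gaussians`, `generate`, `augment`) and the toy data are hypothetical.

```python
import random
import statistics


def fit_class_gaussians(samples, labels):
    """Estimate a (mean, stdev) pair per feature for every class.

    ASSUMPTION: this simple density estimate is a stand-in for the
    generative model; the paper learns the distribution with a GAN.
    """
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    params = {}
    for y, rows in by_class.items():
        columns = list(zip(*rows))  # transpose rows into feature columns
        params[y] = [(statistics.mean(c), statistics.stdev(c)) for c in columns]
    return params


def generate(params, label, n, rng):
    """Draw n synthetic feature vectors for `label` from the fitted estimate."""
    return [[rng.gauss(mu, sigma) for mu, sigma in params[label]] for _ in range(n)]


def augment(samples, labels, n_per_class, seed=0):
    """Return the original data followed by n_per_class generated samples per class."""
    rng = random.Random(seed)
    params = fit_class_gaussians(samples, labels)
    new_x, new_y = list(samples), list(labels)
    for label in sorted(params):
        new_x.extend(generate(params, label, n_per_class, rng))
        new_y.extend([label] * n_per_class)
    return new_x, new_y


# Toy data (hypothetical): two classes in a 2-D feature space.
X = [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
y = [0, 0, 0, 1, 1, 1]
X_aug, y_aug = augment(X, y, n_per_class=10)
```

In a setup like the one the abstract describes, the augmented set (`X_aug`, `y_aug`) would then be used to train the classifier. Because the synthetic samples are drawn from an estimate of each class's own distribution, they stay close to the original data, which is the property the abstract identifies as necessary for augmentation to help rather than hurt test performance.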

Acknowledgements

This work was supported by the Dean’s Research Fund 2018-19 (FLASS/DRF/IDS-3) and the Departmental Collaborative Research Fund 2019 (MIT/DCRF-R2/18-19) of The Education University of Hong Kong, a grant from the Fundamental Research Funds for the Central Universities, China (Project: 2022ECNU-HLYT001), and the Direct Grant (DR22A2) of Lingnan University, Hong Kong.

Author information

Corresponding author

Correspondence to Haoran Xie.

Ethics declarations

Conflict of Interests

The authors have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, Q., Luo, L., Xie, H. et al. A deep data augmentation framework based on generative adversarial networks. Multimed Tools Appl 81, 42871–42887 (2022). https://doi.org/10.1007/s11042-022-13476-w

  • DOI: https://doi.org/10.1007/s11042-022-13476-w
