Abstract
When training convolutional neural networks, the available training data is often insufficient to reach ideal performance, and the resulting models are prone to overfitting. To address this issue, traditional data augmentation (DA) techniques, which are designed manually from empirical results, are commonly adopted in supervised learning. Essentially, traditional DA is an implicit form of feature engineering. Augmentation strategies must be designed carefully; for example, the distribution of the augmented samples should stay close to the original data distribution, otherwise performance on the test set degrades. Instead of designing augmentation strategies by hand, we propose to learn the data distribution directly and to generate new samples from the estimated distribution. Specifically, we propose a deep DA framework consisting of two neural networks: a generative adversarial network, which learns the data distribution, and a convolutional neural network classifier. We evaluate the proposed model on a handwritten Chinese character dataset and a digit dataset, and the experimental results show that it outperforms baseline methods, including a manually well-designed DA method and two state-of-the-art DA methods.
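The abstract does not fix concrete architectures, so the following is only a minimal sketch of the two-network idea it describes: a GAN learns the training-data distribution, and samples drawn from its generator are mixed with the real data when training a CNN classifier. It assumes a PyTorch implementation, 28x28 grayscale inputs (as in digit data), a small fully connected GAN, and a small CNN; the layer sizes, class count, and the Generator/Discriminator/Classifier names are illustrative assumptions, not taken from the paper.

    # Sketch only: illustrative layer sizes and an unconditional GAN; the paper's
    # actual architectures and training details are not reproduced here.
    import torch
    import torch.nn as nn

    class Generator(nn.Module):
        def __init__(self, z_dim=100, img_dim=28 * 28):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(z_dim, 256), nn.ReLU(),
                nn.Linear(256, img_dim), nn.Tanh(),   # pixel values in [-1, 1]
            )

        def forward(self, z):
            return self.net(z)

    class Discriminator(nn.Module):
        def __init__(self, img_dim=28 * 28):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(img_dim, 256), nn.LeakyReLU(0.2),
                nn.Linear(256, 1), nn.Sigmoid(),      # probability the input is real
            )

        def forward(self, x):
            return self.net(x)

    class Classifier(nn.Module):
        """Small CNN classifier trained on real plus GAN-generated samples."""
        def __init__(self, n_classes=10):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            )
            self.head = nn.Linear(32 * 7 * 7, n_classes)

        def forward(self, x):
            return self.head(self.features(x).flatten(1))

    # Usage sketch: once the GAN has been trained adversarially against the
    # Discriminator, draw synthetic images from the Generator and append them
    # (with labels, e.g. via a class-conditional variant) to the real training
    # set before fitting the Classifier.
    G = Generator()
    with torch.no_grad():
        fake = G(torch.randn(64, 100)).view(-1, 1, 28, 28)
    logits = Classifier()(fake)   # a synthetic batch flows through the classifier

The design point the abstract emphasizes is that the generated samples come from an estimated version of the original data distribution, so they augment the training set without the distribution shift that a poorly chosen hand-crafted transformation can introduce.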
Acknowledgements
This work has been supported by the Dean’s Research Fund 2018-19 (FLASS/DRF/IDS-3) and the Departmental Collaborative Research Fund 2019 (MIT/DCRF-R2/18-19) of The Education University of Hong Kong, a grant from the Fundamental Research Funds for the Central Universities, China (Project: 2022ECNU-HLYT001), and the Direct Grant (DR22A2) of Lingnan University, Hong Kong.
Ethics declarations
Conflict of Interests
The authors have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.