Data augmentation for handwritten digit recognition using generative adversarial networks

Jha, Ganesh; Cecotti, Hubert

doi:10.1007/s11042-020-08883-w

Published: 23 April 2020

Volume 79, pages 35055–35068, (2020)

Multimedia Tools and Applications

Abstract

Supervised learning techniques require labeled examples, which can be time-consuming to obtain. In particular, deep learning approaches, where all the feature extraction stages are learned within the artificial neural network, require a large number of labeled examples to train the model. Various data augmentation techniques can address this issue by taking advantage of known variations that have no impact on the label of an example. Typical solutions in computer vision and in document analysis and recognition are based on geometric transformations (e.g., shift and rotation) and random elastic deformations of the original training examples. In this paper, we consider Generative Adversarial Networks (GANs), a technique that creates novel artificial examples without requiring prior knowledge of the variability that exists across examples. When a training dataset contains few labeled examples described in a high-dimensional space, the classifier may generalize poorly. Therefore, we aim to enrich databases of images or signals and improve classifier performance by designing a GAN that creates artificial images. While adding more images through a GAN can help, the extent to which it helps is unknown, and adding too many artificial images may degrade performance. The approach is tested on four datasets of handwritten digits (Latin, Bangla, Devanagari, and Oriya). The accuracy obtained on each dataset shows that adding GAN-generated images to the training dataset improves accuracy. However, the results suggest that adding too many GAN-generated images deteriorates performance.
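To make the two augmentation routes in the abstract concrete, the following is a minimal sketch in Python and not the authors' implementation. It assumes images are stored as lists of pixel rows and examples as (image, label) pairs. The helper names `shift_image` and `mix_training_set` and the `ratio` parameter are hypothetical. The GAN itself is not shown: `synthetic` simply stands in for images a trained generator would produce.

```python
import random


def shift_image(image, dx, dy, fill=0):
    """Translate a 2D image (list of rows) by dx columns and dy rows.

    Pixels shifted outside the frame are dropped; vacated pixels get `fill`.
    A label-preserving geometric transformation of the kind the abstract cites.
    """
    height, width = len(image), len(image[0])
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                out[ny][nx] = image[y][x]
    return out


def mix_training_set(real, synthetic, ratio, seed=0):
    """Return the real examples plus at most ratio * len(real) synthetic ones.

    `ratio` (hypothetical parameter) caps how many artificial examples are
    added, since the abstract reports that adding too many can hurt accuracy.
    """
    rng = random.Random(seed)
    n_extra = min(len(synthetic), int(ratio * len(real)))
    extra = rng.sample(synthetic, n_extra)  # draw without replacement
    mixed = list(real) + extra
    rng.shuffle(mixed)
    return mixed
```

In such a setup, `ratio` would be treated as a hyperparameter and tuned on a validation set, because the abstract states that the benefit of additional generated images is not known in advance.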

Author information

Authors and Affiliations

Department of Computer Science, College of Science and Mathematics, California State University, Fresno (Fresno State), 2576 E. San Ramon MS ST 109, Fresno, CA, 93740-8039, USA
Ganesh Jha & Hubert Cecotti

Corresponding author

Correspondence to Hubert Cecotti.

About this article

Cite this article

Jha, G., Cecotti, H. Data augmentation for handwritten digit recognition using generative adversarial networks. Multimed Tools Appl 79, 35055–35068 (2020). https://doi.org/10.1007/s11042-020-08883-w

Received: 06 May 2019
Revised: 17 March 2020
Accepted: 27 March 2020
Published: 23 April 2020
Issue Date: December 2020
DOI: https://doi.org/10.1007/s11042-020-08883-w
