Abstract
Deep learning architectures, in which all feature extraction stages are learned within the artificial neural network, require a large number of labeled examples to model the variability of the possible inputs. The same issue arises in other classification tasks with many features, where the number of available examples can limit the number of usable input features and leads to the creation of handcrafted feature sets. Typical solutions in computer vision and in document analysis and recognition enlarge the database by applying geometric transformations (e.g., shift and rotation) and random elastic deformations to the original training examples. In this paper, we evaluate the impact of additional images created with generative adversarial networks (GANs), which are deep neural network architectures. We study the addition of images created with a multiclass GAN to different databases of handwritten numerals from different scripts (Latin, Devanagari, and Oriya). The contribution of this paper is the use of multiclass GANs to extend the training database after filtering the generated images with a k-nearest neighbor classifier: a GAN generated image is validated only if its k nearest neighbors all agree on the decision. Accuracy is evaluated with the original training dataset, with the GAN generated images alone, and with the combination of the original training images and the GAN generated images. The results support the conclusion that images generated by a multiclass GAN can provide a robust and fully data driven way to enlarge the training database and improve accuracy on the test dataset.
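The filtering step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `unanimous_knn_filter`, the Euclidean distance on flattened pixels, the default `k=3`, and the requirement that the neighbors agree with the class the GAN was asked to generate are all assumptions, since the abstract does not specify them.

```python
import numpy as np

def unanimous_knn_filter(train_x, train_y, gen_x, gen_y, k=3):
    """Return a boolean mask over generated samples (hypothetical helper).

    A generated sample is kept only if its k nearest neighbors in the
    original training set ALL carry the label the sample was generated for.
    Assumptions: Euclidean distance on flattened images; k is a free choice.
    """
    train_x = np.asarray(train_x, dtype=np.float64).reshape(len(train_x), -1)
    gen_x = np.asarray(gen_x, dtype=np.float64).reshape(len(gen_x), -1)
    train_y = np.asarray(train_y)
    gen_y = np.asarray(gen_y)

    # Squared Euclidean distances, shape (n_generated, n_train).
    # Brute force: fine for a sketch, memory-hungry for large datasets.
    d2 = ((gen_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)

    # Indices of the k closest training examples for each generated sample.
    nearest = np.argsort(d2, axis=1)[:, :k]

    # Labels of those neighbors, shape (n_generated, k).
    neighbor_labels = train_y[nearest]

    # Unanimity: every neighbor must match the intended class.
    return (neighbor_labels == gen_y[:, None]).all(axis=1)

def extend_training_set(train_x, train_y, gen_x, gen_y, k=3):
    """Append only the validated generated samples to the training set."""
    mask = unanimous_knn_filter(train_x, train_y, gen_x, gen_y, k)
    new_x = np.concatenate([np.asarray(train_x), np.asarray(gen_x)[mask]])
    new_y = np.concatenate([np.asarray(train_y), np.asarray(gen_y)[mask]])
    return new_x, new_y
```

Requiring unanimity rather than a majority vote makes the filter conservative: a generated image lying between two classes is discarded even if most of its neighbors have the intended label.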
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Cecotti, H., Jha, G. (2019). Training Dataset Extension Through Multiclass Generative Adversarial Networks and K-nearest Neighbor Classifier. In: Santosh, K., Hegadi, R. (eds) Recent Trends in Image Processing and Pattern Recognition. RTIP2R 2018. Communications in Computer and Information Science, vol 1035. Springer, Singapore. https://doi.org/10.1007/978-981-13-9181-1_52
DOI: https://doi.org/10.1007/978-981-13-9181-1_52
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-9180-4
Online ISBN: 978-981-13-9181-1
eBook Packages: Computer Science (R0)