Elsevier

Pattern Recognition Letters

Volume 106, 15 April 2018, Pages 20-26

Robust offline handwritten character recognition through exploring writer-independent features under the guidance of printed data

https://doi.org/10.1016/j.patrec.2018.02.006

Highlights

  • Printed data serves as prior knowledge and is learnt objectively through adversarial learning.

  • Our model automatically exploits writer-independent features on limited data.

  • We propose a novel adversarial feature learning model for the classification task.

  • We achieve a state-of-the-art result on the offline ICDAR-2013 HCCR dataset.

Abstract

Deep convolutional neural networks have made great progress in handwritten character recognition (HCR) in recent years by learning discriminative features from large amounts of labeled data. However, the large variance of handwriting styles across writers remains a major challenge for robust HCR. To alleviate this issue, an intuitive idea is to extract writer-independent semantic features from handwritten characters. Standard printed characters are writer-independent stencils for handwritten characters, so they can be used as prior knowledge to guide models to exploit writer-independent semantic features for HCR. In this paper, we propose a novel adversarial feature learning (AFL) model that incorporates the prior knowledge of printed data and writer-independent semantic features to improve the performance of HCR on limited training data. Unlike existing methods based on handcrafted features, the proposed AFL model exploits writer-independent semantic features automatically, and the standard printed data serving as prior knowledge is learnt objectively. Systematic experiments on MNIST and CASIA–HWDB show that the proposed model is competitive with state-of-the-art methods on the offline HCR task.

Introduction

Handwritten character recognition (HCR) has been widely applied in mail sorting [29], historical document recognition [24], handwritten note transcription and bank check reading. It is also an important component of handwritten text recognition [25]. Decades of effort have been devoted to HCR, but robust HCR is still a challenging task due to the huge variance of handwriting styles. The recognition task is particularly difficult when the handwritten characters are offline, and offline HCR is the focus of this paper.

HCR is a typical classification task, and the deep convolutional neural network (DCNN) is one of the most successful classification models, having made great progress in many fields. DCNNs have been widely applied to HCR and have achieved significant performance improvements [5], [7], [8], [31], [32], [35]. However, their success relies heavily on large amounts of labeled data, which are costly to obtain. Moreover, DCNN-based HCR systems suffer from the shift between the training and test distributions [33] when the variance of handwriting styles is large. Previous work has mainly proposed three kinds of methods to alleviate this issue: 1) using data augmentation techniques [5], [7] to generate more data with different handwriting styles, such as affine transformation [19] and distorted sample generation [14]; 2) adopting writer adaptation to match the feature distributions of the source domain to those of the target domain [30], [32], [33]; 3) manually designing writer-independent features to reduce the within-class variation of character shape [32], [35], such as normalization-cooperated gradient features [16]. These methods are carefully designed on the basis of domain-specific knowledge to compensate for the shape variation caused by different handwriting styles. However, generating more distorted data is insufficient to cover all the variations of handwritten characters, writer adaptation methods need to match specific writers, and handcrafted writer-independent features are so subjective that important information in the characters may be lost. Besides, these methods do not explicitly model the final recognition objective, and they cannot take advantage of extra information.

It is well known that standard printed characters are writer-independent and present more of the semantic content of characters. Handwritten characters, by contrast, carry information about the varied handwriting styles of their writers, as shown in Fig. 1, which greatly interferes with their recognition. Many experiments show that printed characters can be recognized more easily without the interference of writers’ handwriting styles [1]. In fact, standard printed characters are commonly used as stencils that teach us to recognize new characters, for example when pupils learn to read from a textbook. It is therefore possible to improve the performance of HCR by exploiting writer-independent semantic features, with standard printed characters used as prior knowledge to guide models toward these features.

The generative adversarial network (GAN) is a well-known adversarial learning model [10]. It is composed of a generator and a discriminator. Through adversarial learning, the discriminator guides the generator to map a complex data distribution to another specific distribution. When a GAN is trained on a handwritten character set, such as the MNIST dataset [13], the generator learns to transform noise vectors into realistic character images [6], [26]. Moreover, [28] used a GAN to transfer images from the street view house number (SVHN) dataset to the domain of the MNIST dataset. Therefore, as shown in Fig. 2(a), it can be expected that handwritten characters with various handwriting styles could be transferred into standard printed characters by adversarial learning.
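
For reference, the two-player game between the generator and the discriminator in [10] is usually written as the minimax objective below. The symbols follow the common notation of [10] and are not defined in this excerpt: G is the generator, D the discriminator, p_data the distribution of real samples, and p_z the distribution of the noise vectors.

```latex
\min_{G}\max_{D}\; V(D,G) =
  \mathbb{E}_{x\sim p_{\mathrm{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z\sim p_{z}(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator maximizes V by assigning high probability to real samples and low probability to generated ones, while the generator minimizes V by producing samples that the discriminator accepts as real.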

Inspired by GAN, we propose a novel adversarial feature learning (AFL) model to exploit writer-independent semantic features for HCR. As shown in Fig. 2(b), AFL is a variant of GAN composed of a feature extractor, a discriminator and a classifier, and it combines the strengths of discriminative and generative models for HCR. The feature extractor extracts encoding features of handwritten and printed characters. The discriminator judges whether the extracted features come from handwritten or standard printed characters. With the prior knowledge provided by standard printed characters, the discriminator guides the feature extractor to exploit writer-independent semantic features from handwritten characters automatically. Finally, the extracted features are fed into the classifier to recognize the handwritten characters. The feature extractor, the discriminator and the classifier are jointly optimized by adversarial training. During adversarial training, the prior knowledge from standard printed characters and the writer-independent semantic features are incorporated, which leads to better HCR performance.
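
The excerpt does not give the loss functions used to train the three components, so the following is only a minimal sketch of how such a joint objective could be assembled. It assumes the standard GAN cross-entropy losses of [10] plus a cross-entropy classification term. The function names, the labelling of printed features as 1 and handwritten features as 0, and the trade-off weight `adv_weight` are illustrative assumptions rather than the paper's definitions. The inputs are plain lists of probabilities that the discriminator and the classifier would output.

```python
import math

EPS = 1e-12  # small constant guarding against log(0)


def discriminator_loss(d_printed, d_handwritten):
    """Binary cross-entropy for the discriminator (assumed labelling:
    features of printed characters -> 1, handwritten -> 0).

    Both arguments are lists of discriminator outputs in (0, 1).
    """
    loss_printed = -sum(math.log(p + EPS) for p in d_printed) / len(d_printed)
    loss_hand = -sum(math.log(1.0 - p + EPS) for p in d_handwritten) / len(d_handwritten)
    return loss_printed + loss_hand


def extractor_adversarial_loss(d_handwritten):
    """Adversarial term for the feature extractor: it is small when the
    discriminator mistakes handwritten features for printed ones."""
    return -sum(math.log(p + EPS) for p in d_handwritten) / len(d_handwritten)


def classification_loss(class_probs, labels):
    """Mean cross-entropy of the classifier's predicted distributions
    against the integer class labels."""
    total = sum(math.log(probs[y] + EPS) for probs, y in zip(class_probs, labels))
    return -total / len(labels)


def joint_loss(class_probs, labels, d_handwritten, adv_weight=1.0):
    """Assumed joint objective for extractor and classifier:
    classification loss plus a weighted adversarial term.
    adv_weight is a hypothetical trade-off parameter."""
    return (classification_loss(class_probs, labels)
            + adv_weight * extractor_adversarial_loss(d_handwritten))


if __name__ == "__main__":
    # Toy batch: two printed and two handwritten feature vectors scored by
    # the discriminator, and classifier outputs for the two handwritten ones.
    d_printed = [0.9, 0.8]
    d_handwritten = [0.3, 0.4]
    class_probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
    labels = [0, 1]
    print("discriminator loss:", discriminator_loss(d_printed, d_handwritten))
    print("joint extractor/classifier loss:",
          joint_loss(class_probs, labels, d_handwritten, adv_weight=0.5))
```

In an alternating training scheme of this kind, the discriminator would be updated to reduce `discriminator_loss`, and the feature extractor and classifier would then be updated to reduce `joint_loss`, so that the handwritten features become both class-discriminative and hard to tell apart from printed ones.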

We summarize our contributions as follows: 1) we introduce writer-independent standard printed data as prior knowledge, which is learnt objectively by AFL, rather than using handcrafted writer-independent features that require a great amount of domain knowledge; 2) the proposed model exploits writer-independent semantic features automatically, which in turn alleviates the effect of the large variance of handwriting styles on HCR; 3) the proposed AFL model improves classification by combining the strengths of discriminative and generative models; 4) we outperform state-of-the-art models on the offline ICDAR-2013 handwritten Chinese character recognition competition dataset.

The remainder of this paper is organized as follows. Section 2 reviews related work. Section 3 describes the proposed AFL method. Experimental results and their detailed analysis are presented in Section 4. Finally, we draw concluding remarks in Section 5.

Section snippets

Related work

A GAN [10] learns a generative model that synthesizes images similar to real images through a two-player game between a generator and a discriminator (Fig. 2(a)). The key idea of GAN is an adversarial loss that forces the generator to find the mapping between noise vectors and real images. Despite many promising developments [2], [11], more recent works focus on image synthesis, such as image generation [4], [22] and representation learning [23]. There are a few trials to use GAN to make

Adversarial feature learning

The proposed AFL model tries to improve the performance of HCR by learning writer-independent semantic features of handwritten characters, with standard printed characters provided as prior knowledge. This differs from the conventional feature learning of DCNNs, where the primary goal is to extract discriminative features that fit the training set. AFL is composed of three neural network components: a feature extractor (F) that characterizes the features of handwritten and standard printed

Datasets

We conducted preliminary experiments on the widely adopted MNIST dataset [13]. MNIST consists of 60,000 training samples and 10,000 test samples of handwritten digits of size 28 × 28, covering 10 classes from 0 to 9. To further evaluate the effectiveness of the proposed AFL algorithm for HCR, we evaluate our method on the challenging ICDAR-2013 offline handwritten Chinese character recognition competition dataset [31], which is a large scale classification task with great diversity in

Conclusion

In this paper, we improve the performance of HCR by exploiting writer-independent semantic features with the prior knowledge of standard printed characters, which is implemented by the proposed AFL model. Compared with DCNN methods for HCR, the proposed AFL model achieves a significant performance improvement, especially when the available training data is inadequate. Specifically, we achieve a state-of-the-art result on offline handwritten Chinese character recognition.

We mention that our

Acknowledgment

This work was supported in part by the National Natural Science Foundation of China (no. 61573357, no. 61503382, no. 61403370, no. 61273267, no. 91120303).

References (35)

  • D. Cireşan et al., Multi-column deep neural networks for offline handwritten Chinese character classification, 2015 International Joint Conference on Neural Networks (IJCNN) (2015)

  • Y. Ganin et al., Unsupervised domain adaptation by backpropagation, International Conference on Machine Learning (2015)

  • I. Goodfellow et al., Generative adversarial nets, Advances in Neural Information Processing Systems (2014)

  • I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, A. Courville, Improved training of Wasserstein GANs, arXiv:1704.00028...

  • S. Ioffe et al., Batch normalization: Accelerating deep network training by reducing internal covariate shift, International Conference on Machine Learning (2015)

  • Y. LeCun et al., MNIST handwritten digit database, AT&T Labs (2010)

  • K. Leung et al., Recognition of handwritten Chinese characters by combining regularization, Fisher’s discriminant and distorted sample generation, 10th International Conference on Document Analysis and Recognition (ICDAR’09) (2009)