Neurocomputing

Volume 415, 20 November 2020, Pages 146-156

Occluded offline handwritten Chinese character inpainting via generative adversarial network and self-attention mechanism

https://doi.org/10.1016/j.neucom.2020.07.046

Abstract

Occluded offline handwritten Chinese character inpainting is a critical step toward handwritten Chinese character recognition. We propose to apply a generative adversarial network and a self-attention mechanism to inpaint occluded offline handwritten Chinese characters. First, a cyclic loss is used, instead of masks, to guarantee the cyclic consistency of the uncorrupted area between corrupted images and the original real images. Second, a self-attention mechanism is combined with the generative adversarial network to enlarge the receptive field and exploit more Chinese character features. Third, an improved character-VGG-19, pre-trained on a handwritten Chinese character dataset, is used to compute a content loss that extracts character features more effectively and helps the generator produce realistic characters. Finally, an adversarial classification loss makes the discriminator classify input images, rather than merely distinguishing real images from fake ones, so that the distribution of Chinese characters is learned more effectively. The proposed method is evaluated on an occluded CASIA-HWDB1.1 dataset over three challenging inpainting tasks: occluded blocks of different sizes, randomly missing pixels, and randomly added pixels. Experimental results show that our method outperforms several state-of-the-art handwritten Chinese character inpainting methods.

Introduction

Handwritten Chinese character recognition (HCCR) has been an active research area for several decades. It has broad application prospects in barrier-free reading, document entry, translation, bank-note processing, and the sorting of postal letters and express parcels, enabling users to input information quickly and improving work efficiency in many fields. Existing models make use of the essential structural features of Chinese characters. In the real world, however, ancient books and handwritten manuscripts often contain text breakage and smudges. When characters are corrupted, they are difficult to recognize because their structural features are invisible. Therefore, the first step toward occluded offline HCCR is to inpaint the images of occluded handwritten Chinese characters effectively.

Several approaches to image inpainting have been proposed, such as total variation [1], low-rank structure [2], convolutional neural networks [3], exemplar-based image inpainting [4] and generative adversarial networks [5]. However, most of these methods need to know the exact corrupted positions from masks labelled beforehand, and then inpaint the corrupted area directly. Recently, [6] proposed a handwritten Chinese character inpainting (HCCI) method based on the deep convolutional generative adversarial network (DCGAN) [7]. The generator and discriminator of DCGAN are first combined to generate realistic Chinese characters from corrupted images, and a contextual loss and a content loss are then used to inpaint the generated images without knowing the exact positions of the corrupted regions.
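The contextual-plus-content loss idea in [6] can be sketched as follows. This is a minimal illustration under simplifying assumptions, not the paper's exact formulation: the contextual loss compares the generated and corrupted images over the uncorrupted region, while the content loss compares feature representations extracted by a pretrained network (left abstract here).

```python
import numpy as np

def contextual_loss(generated, corrupted, uncorrupted_mask):
    """L1 distance restricted to the uncorrupted region.

    uncorrupted_mask is 1 where the input image is intact, 0 where it
    is occluded; normalizing by the mask size keeps the loss comparable
    across different amounts of occlusion."""
    diff = np.abs(generated - corrupted) * uncorrupted_mask
    return diff.sum() / max(uncorrupted_mask.sum(), 1)

def content_loss(feat_generated, feat_real):
    """Mean squared distance between feature maps produced by a
    pretrained feature extractor (e.g. an intermediate VGG layer)."""
    return np.mean((feat_generated - feat_real) ** 2)
```

In [6] these two terms are combined so the generator both stays faithful to the visible pixels and matches the feature statistics of real characters.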

Despite the breakthroughs in inpainting occluded handwritten Chinese characters using unsupervised GANs, several central problems remain largely unsolved. First, traditional methods use masks to locate the exact corrupted positions, and methods based on convolutional neural networks also use masks as part of the loss function to guide network training. Masks allow these methods to protect the uncorrupted portions, but such prior information is time-consuming and laborious to annotate. How to automatically locate broken positions in an image is therefore a critical problem. Second, methods based on convolutional neural networks rely on the local information around corrupted areas. As the corrupted areas grow, however, the receptive fields of convolutional neural networks may become too small, leading to inadequate use of features and blurry results. How to use more information from the uncorrupted parts of an image to inpaint characters with large occluded regions, and to generate more realistic characters, is another open problem. Finally, methods based on generative adversarial networks learn the distribution of characters and then use this distribution to inpaint corrupted characters, which leads to a further problem: the generator may produce fake characters, i.e., images that look like Chinese characters but do not belong to any category of real characters. For example, as shown in Fig. 1, the generated images resemble Chinese characters, but they belong to no category.

In this paper, we present a new model, the handwritten Chinese character inpainting generative adversarial network (HCCI-GAN), which addresses the three problems above with three key strategies. For the first problem, we use a cyclic loss that makes the generator locate occluded positions and protect uncorrupted areas, saving annotation effort while guaranteeing the cyclic consistency of the uncorrupted areas between corrupted images and the original real images. For the second problem, in order to use more information from the uncorrupted parts of images to inpaint characters with large occluded regions and generate more realistic images, we adopt a self-attention mechanism [8], which can exploit cues from all feature locations. Moreover, we take the intermediate output of VGG-19 [9] as our feature representation to compute the content loss between generated images and the original real images. Since the original VGG-19 is trained on natural color images and is not suited to Chinese characters, we pre-train a new character-VGG-19 on a handwritten Chinese character dataset. For the last problem, an adversarial classification loss is combined with the cyclic loss and the new content loss to discourage the generator from "inventing" meaningless Chinese characters.
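The self-attention mechanism of [8], which lets every feature location attend to every other location, can be sketched in NumPy as follows. This is a minimal illustration of the idea; the 1x1-convolution weight matrices `Wf`, `Wg`, `Wh` and the learnable scale `gamma` are placeholders, not the paper's trained parameters.

```python
import numpy as np

def self_attention(x, Wf, Wg, Wh, gamma=0.0):
    """SAGAN-style self-attention over a feature map.

    x: (C, H, W) feature map; Wf, Wg, Wh play the role of 1x1-conv
    weights of shapes (Ck, C), (Ck, C), (C, C). Returns x plus the
    gamma-scaled attention output, so gamma=0 reduces to the identity."""
    C, H, W = x.shape
    N = H * W
    flat = x.reshape(C, N)                       # flatten spatial locations
    f = Wf @ flat                                # queries: (Ck, N)
    g = Wg @ flat                                # keys:    (Ck, N)
    h = Wh @ flat                                # values:  (C, N)
    logits = f.T @ g                             # (N, N) pairwise responses
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    beta = np.exp(logits)
    beta /= beta.sum(axis=1, keepdims=True)      # softmax attention map
    o = h @ beta.T                               # attend over all locations
    return (flat + gamma * o).reshape(C, H, W)
```

Because each output location is a weighted sum over all input locations, the effective receptive field covers the whole image, which is what allows large occluded regions to draw on distant uncorrupted strokes.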

The main contributions of our work include:

  • We build a new occluded offline handwritten Chinese character dataset based on the CASIA-HWDB1.1 dataset. A series of random occlusions is added to the original handwritten Chinese characters to obtain corrupted character samples, which serve as the input to our model.

  • We propose a novel model for inpainting occluded offline handwritten Chinese characters, combining the original GAN with a self-attention mechanism, a pre-trained character-VGG-19 used as a perceptual loss, an adversarial classification loss and a cyclic loss, yielding an end-to-end model, HCCI-GAN.

  • We inpaint occluded handwritten Chinese characters with large occluded blocks effectively and generate more realistic images. Experimental results show that the proposed model outperforms several state-of-the-art HCCI methods.


Handwritten Chinese character recognition

Handwritten Chinese character recognition (HCCR) is a challenging problem due to the large number of character categories, confusion between similar characters, and distinct handwriting styles across individuals [10], [11]. Various methods have been proposed to solve the problem. Traditional HCCR pipelines usually include image normalization, feature extraction, dimension reduction and classifier training. The recognition algorithms include support vector machines (SVM) [12]

Self-attention handwritten Chinese character inpainting model

Fig. 3 illustrates the model of the handwritten Chinese character inpainting generative adversarial network (HCCI-GAN). Unlike the original GAN, we divide the generative network into three modules. The first is the encoder module (E), which encodes an input image into an intermediate feature map using several convolutional layers to extract features. The second is the inpainter module (I), which contains a self-attention layer and several res-blocks to inpaint the corrupted area of the
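This three-module layout can be sketched in PyTorch as follows. The channel sizes, layer counts, and the use of `nn.MultiheadAttention` as a stand-in for the self-attention layer are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    """Residual block used inside the inpainter module."""
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1))

    def forward(self, x):
        return x + self.body(x)

class HCCIGenerator(nn.Module):
    """Sketch of the three-module generator: an encoder (E) downsamples
    the corrupted character image, an inpainter (I) applies self-attention
    and res-blocks over the feature map, and a decoder upsamples back to
    image space."""
    def __init__(self, ch=64):
        super().__init__()
        self.encoder = nn.Sequential(                       # E: extract features
            nn.Conv2d(1, ch, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch * 2, 4, stride=2, padding=1), nn.ReLU(inplace=True))
        self.attn = nn.MultiheadAttention(ch * 2, num_heads=1, batch_first=True)
        self.inpainter = nn.Sequential(ResBlock(ch * 2), ResBlock(ch * 2))  # I
        self.decoder = nn.Sequential(                       # reconstruct image
            nn.ConvTranspose2d(ch * 2, ch, 4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(ch, 1, 4, stride=2, padding=1), nn.Tanh())

    def forward(self, x):
        feat = self.encoder(x)
        b, c, h, w = feat.shape
        seq = feat.flatten(2).transpose(1, 2)               # (B, H*W, C)
        attn_out, _ = self.attn(seq, seq, seq)              # attend globally
        feat = feat + attn_out.transpose(1, 2).reshape(b, c, h, w)
        return self.decoder(self.inpainter(feat))
```

Placing the attention layer between the encoder and the res-blocks lets the inpainter see the whole character before filling in the occluded region.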

Datasets

The CASIA-HWDB1.1 handwritten Chinese character library includes the 3755 first-level GB2312 Chinese characters, as shown in Fig. 5. Each character in the dataset was written by 300 writers. We chose 80% of the samples to compose our original training set, and the rest to compose the original testing set. A series of random occlusions is then applied to generate the final training and testing sets, including black or white occluded blocks, missing pixels and added noise. The ratio of the
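The three occlusion types described above can be generated as follows. This is a minimal sketch: the function names, the `ratio` parameter, and the exact pixel conventions (white background, values in [0, 1]) are assumptions for illustration, not the paper's exact settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def occlude(img, mode, ratio=0.2):
    """Corrupt a grayscale character image (values in [0, 1], white = 1.0
    background). mode: 'block' paints a random rectangle black or white,
    'drop' erases random pixels to the background, 'noise' adds random
    black pixels. `ratio` controls the corrupted portion."""
    out = img.copy()
    h, w = out.shape
    if mode == "block":
        bh, bw = int(h * ratio), int(w * ratio)
        y = rng.integers(0, h - bh + 1)
        x = rng.integers(0, w - bw + 1)
        out[y:y + bh, x:x + bw] = rng.choice([0.0, 1.0])  # black or white block
    elif mode == "drop":
        mask = rng.random(out.shape) < ratio
        out[mask] = 1.0                                   # missing pixels
    elif mode == "noise":
        mask = rng.random(out.shape) < ratio
        out[mask] = 0.0                                   # added noise pixels
    return out
```

Applying each mode at several ratios to every clean sample yields paired (corrupted, original) images for training and evaluation.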

Conclusions

This paper proposes an end-to-end occluded handwritten Chinese character inpainting model (HCCI-GAN) based on a self-attention mechanism and a generative adversarial network. HCCI-GAN employs three key strategies. It first uses a self-attention mechanism to capture global dependencies and inpaint corrupted characters with large occluded blocks. Then, a character-VGG-19 is pre-trained to compute the content loss. Finally, the adversarial classification loss and cyclic loss are combined with the new content

CRediT authorship contribution statement

Ge Song: Conceptualization, Methodology, Software, Writing - original draft. Jianwu Li: Writing - review & editing, Supervision, Funding acquisition. Zheng Wang: Validation, Formal analysis, Investigation, Data curation.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

The authors would like to thank the editor and anonymous reviewers for their constructive suggestions that improved this paper greatly. This work was supported by the Beijing Natural Science Foundation (No. L191004) and the National Natural Science Foundation of China (No. 61271374).

Ge Song is a Master's candidate at the School of Computer Science and Technology, Beijing Institute of Technology, China. Her research interests include machine learning and image processing.

References (35)

  • K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition, arXiv preprint...
  • F. Kimura et al., Modified quadratic discriminant functions and the application to Chinese character recognition, IEEE Trans. Pattern Anal. Mach. Intell. (1987)
  • R. Dai et al., Chinese character recognition: history, status and prospects, Front. Comput. Sci. China (2007)
  • C. Cortes et al., Support-vector networks, Mach. Learn. (1995)
  • D.S. Yeung et al., Handwritten Chinese character recognition by rule-embedded neocognitron, Neural Comput. Appl. (1994)
  • W. Liu et al., A new Chinese character recognition approach based on the fuzzy clustering analysis, Neural Comput. Appl. (2014)
  • Y. LeCun et al., Deep learning, Nature (2015)
Jianwu Li received the B.S., M.Eng. and Ph.D. degrees from Tianjin University, China, in 1997, 2000 and 2003, respectively. He is currently an Associate Professor with the School of Computer Science and Technology, Beijing Institute of Technology, China. His research interests include machine learning and image processing. He is the corresponding author of this paper.

Zheng Wang is a Master's candidate at the School of Computer Science and Technology, Beijing Institute of Technology, China. His research interests include computer vision, machine learning and image processing.
