A Novel Data Augmentation Method for Chinese Character Spatial Structure Recognition by Normalized Deformable Convolutional Networks

Zhuo, Sheng; Zhang, Jiangshe; Zhang, Chunxia

doi:10.1007/s11063-022-10873-y

Published: 30 August 2022

Neural Processing Letters, volume 54, pages 5545–5563 (2022)

Abstract

In this paper, we propose a novel data augmentation method and a normalized deformable convolutional network for natural image classification and handwritten Chinese character structure recognition. Spatial structure is a basic characteristic of Chinese characters, and it plays a very important role in understanding and learning them. However, convolutional neural networks are inherently limited in modeling geometric transformations because of the fixed geometric structures of their building modules, so we use a deformable convolutional network for this task. Furthermore, we propose a normalized deformable convolutional network to improve the stability and accuracy of the model. In addition, because some traditional data augmentation methods can change one Chinese character structure into another, we propose a novel data augmentation method named Matt data augmentation (MDA) to improve recognition performance. The normalized deformable ResNet with MDA achieves the best accuracy (93.62%) on the handwritten Chinese character structure data set. With MDA, CapsuleNet also improves to 89.41% test accuracy, compared with 87.75% without MDA. Extensive experiments validate the performance of our approach.
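
The remark that a traditional augmentation can turn one character structure into another can be illustrated with a toy sketch. The abstract does not describe how MDA works, so the code below is not MDA. It only shows, under simplifying assumptions, how a 90-degree rotation changes a left-right layout into a top-bottom one. The grid representation and the helper names `rotate90` and `layout` are hypothetical and chosen purely for illustration.

```python
# Toy illustration only (NOT the paper's MDA): a label-changing augmentation.
# A "character" is a small binary grid; both helpers are hypothetical.

def rotate90(grid):
    """Rotate a 2D list of 0/1 values by 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]

def layout(grid):
    """Classify a toy grid by the direction of its empty interior gap."""
    rows, cols = len(grid), len(grid[0])
    # An all-zero interior column separates a left part from a right part.
    empty_col = any(
        all(grid[r][c] == 0 for r in range(rows)) for c in range(1, cols - 1)
    )
    # An all-zero interior row separates a top part from a bottom part.
    empty_row = any(
        all(v == 0 for v in grid[r]) for r in range(1, rows - 1)
    )
    if empty_col and not empty_row:
        return "left-right"
    if empty_row and not empty_col:
        return "top-bottom"
    return "other"

# Two components side by side, separated by an empty middle column.
left_right = [
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
]

print(layout(left_right))            # left-right
print(layout(rotate90(left_right)))  # top-bottom
```

If the rotated sample kept its original "left-right" label during training, it would be a mislabeled example, which is the kind of problem the abstract attributes to some traditional augmentation methods.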

Author information

Authors and Affiliations

School of Mathematics and Statistics, Xi’an Jiaotong University, Xi’an, 710049, China
Sheng Zhuo, Jiangshe Zhang & Chunxia Zhang

Corresponding author

Correspondence to Sheng Zhuo.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhuo, S., Zhang, J. & Zhang, C. A Novel Data Augmentation Method for Chinese Character Spatial Structure Recognition by Normalized Deformable Convolutional Networks. Neural Process Lett 54, 5545–5563 (2022). https://doi.org/10.1007/s11063-022-10873-y

Accepted: 03 May 2022
Published: 30 August 2022
Issue Date: December 2022
DOI: https://doi.org/10.1007/s11063-022-10873-y
