A Novel Data Augmentation Method for Chinese Character Spatial Structure Recognition by Normalized Deformable Convolutional Networks

Neural Processing Letters

Abstract

In this paper, we propose a novel data augmentation method and a normalized deformable convolutional network for natural image classification and handwritten Chinese character structure recognition. Spatial structure is a basic characteristic of Chinese characters, and it plays an important role in understanding and learning them. However, convolutional neural networks are inherently limited in modeling geometric transformations because of the fixed geometric structures of their building modules, so we use a deformable convolutional network for this task. Furthermore, we propose a normalized deformable convolutional network to improve the stability and accuracy of the model. In addition, because some traditional data augmentation methods can change one Chinese character structure into another, we propose a novel data augmentation method named Matt data augmentation (MDA) to improve recognition performance. The normalized deformable ResNet with MDA achieves the best accuracy (93.62%) on the handwritten Chinese character structure data set. CapsuleNet with MDA also improves, reaching 89.41% test accuracy compared with 87.75% without MDA. Extensive experiments validate the performance of our approach.
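To make the "fixed geometric structure" limitation concrete, the following is a minimal pure-Python sketch of the general deformable convolution idea: each kernel tap samples the input at its regular grid position plus a fractional offset, using bilinear interpolation. The function names, the 3x3 kernel, and the hand-set offsets are illustrative assumptions, not the authors' implementation; in a real network the offsets are predicted by a learned layer. The sketch does not cover the paper's normalization step or MDA, which the abstract does not specify.

```python
import math


def bilinear_sample(image, y, x):
    """Sample a 2D list-of-lists image at fractional (y, x).

    Positions outside the image contribute zero (zero padding).
    """
    h, w = len(image), len(image[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    # Blend the four integer neighbours, weighted by proximity.
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                value += weight * image[yy][xx]
    return value


def deformable_conv_at(image, kernel, offsets, cy, cx):
    """Compute one output value of a deformable convolution at (cy, cx).

    offsets[i][j] is a (dy, dx) pair for kernel tap (i, j). These are
    hypothetical hand-set values here; a network would predict them.
    """
    k = len(kernel)
    r = k // 2
    out = 0.0
    for i in range(k):
        for j in range(k):
            dy, dx = offsets[i][j]
            # Regular grid position plus the per-tap offset.
            out += kernel[i][j] * bilinear_sample(
                image, cy + i - r + dy, cx + j - r + dx
            )
    return out


# Toy input: a 4x4 ramp image and an all-ones 3x3 kernel.
image = [[float(row * 4 + col) for col in range(4)] for row in range(4)]
kernel = [[1.0] * 3 for _ in range(3)]

# Zero offsets reduce to an ordinary convolution with a fixed grid.
zero_offsets = [[(0.0, 0.0)] * 3 for _ in range(3)]
# Shifting every tap half a pixel to the right deforms the sampling grid.
shifted_offsets = [[(0.0, 0.5)] * 3 for _ in range(3)]

regular = deformable_conv_at(image, kernel, zero_offsets, 1, 1)
deformed = deformable_conv_at(image, kernel, shifted_offsets, 1, 1)
```

With zero offsets the result matches a standard 3x3 convolution at that position. Non-zero offsets move the sampling points off the fixed grid, which is the property that lets a deformable layer adapt to geometric variation in handwritten strokes.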

Author information

Corresponding author

Correspondence to Sheng Zhuo.

About this article

Cite this article

Zhuo, S., Zhang, J. & Zhang, C. A Novel Data Augmentation Method for Chinese Character Spatial Structure Recognition by Normalized Deformable Convolutional Networks. Neural Process Lett 54, 5545–5563 (2022). https://doi.org/10.1007/s11063-022-10873-y

  • DOI: https://doi.org/10.1007/s11063-022-10873-y
