Abstract
Previous work on Chinese character recognition (CCR) has achieved great success; however, little of it addresses the recognition of Chinese characters against complex backgrounds. This paper focuses on the recognition of overlaid Chinese characters, that is, Chinese characters embedded in images or videos, which often appear on complex backgrounds and in diverse typefaces and styles. We present a high-performance recognizer based on a deep convolutional neural network (CNN). To train the CNN, a large number of character images are first generated synthetically. By jointly considering the input size, depth, width, and filter sizes of a network, we propose multiple candidate models with compact architectures. Comprehensive comparison experiments guide the model selection; the selected model requires only 13.6M for storage (3.6M parameters) and takes only 0.038 s to recognize 3755 character images on a GPU. The experimental results show that the model achieves a recognition rate of 99.77% on the test set, and good generalization is also validated on a dataset of typefaces not included in the training set. In addition, the extensive comparison experiments presented in this paper may shed light on the design of deep CNNs.
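As a rough sanity check on the figures quoted in the abstract, the sketch below relates parameter count to storage size and batch time to per-character latency. It assumes each parameter is stored as a 32-bit float and that the 3755 images are processed as one timed batch; neither assumption is stated in the abstract, and the function names are hypothetical.

```python
def model_size_mib(num_params: float, bytes_per_param: int = 4) -> float:
    """Approximate storage in MiB, assuming a fixed byte width per parameter.

    The 4-byte default (32-bit floats) is an assumption, not a detail
    taken from the paper.
    """
    return num_params * bytes_per_param / 2**20


def per_image_latency_us(total_seconds: float, num_images: int) -> float:
    """Average time per image in microseconds for a timed run over num_images."""
    return total_seconds / num_images * 1e6


# Figures quoted in the abstract: 3.6M parameters, 0.038 s for 3755 images.
size_mib = model_size_mib(3.6e6)
latency_us = per_image_latency_us(0.038, 3755)

print(f"approx. model size: {size_mib:.1f} MiB")
print(f"approx. time per character: {latency_us:.1f} microseconds")
```

Under these assumptions the estimate comes to roughly 13.7 MiB, close to the reported 13.6M (the parameter count is itself rounded), and to about 10 microseconds per character image.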
Acknowledgments
This work is supported by the National Natural Science Foundation of China under Grants No. 61271434 and No. 61232013, and by the Beijing Advanced Innovation Center for Imaging Technology under Grant No. BAICIT-2016009.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Li, H., Wang, W. (2018). Overlaid Chinese Character Recognition via a Compact CNN. In: Zeng, B., Huang, Q., El Saddik, A., Li, H., Jiang, S., Fan, X. (eds) Advances in Multimedia Information Processing – PCM 2017. PCM 2017. Lecture Notes in Computer Science(), vol 10735. Springer, Cham. https://doi.org/10.1007/978-3-319-77380-3_43
DOI: https://doi.org/10.1007/978-3-319-77380-3_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-77379-7
Online ISBN: 978-3-319-77380-3
eBook Packages: Computer Science, Computer Science (R0)