Abstract
Recognizing Chinese characters in natural images is a challenging task because they usually appear in artistic fonts and different styles, and under various lighting and occlusion conditions. This paper proposes a novel method named ICBAM (Improved Convolutional Block Attention Module) for Chinese character recognition in the wild. We present the concept of attention disturbance and combine it with CBAM (Convolutional Block Attention Module), which improves the generalization performance of the network and effectively avoids over-fitting. Owing to its design, ICBAM is easy to train and deploy. Notably, the module has no trainable parameters. Experiments conducted on the ICDAR 2019 ReCTS competition dataset demonstrate that our approach significantly outperforms state-of-the-art techniques. In addition, we verify the generalization performance of our method on the CTW dataset.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhou, K., Zhou, Y., Zhang, R., Wei, X. (2020). An Improved Convolutional Block Attention Module for Chinese Character Recognition. In: Bai, X., Karatzas, D., Lopresti, D. (eds) Document Analysis Systems. DAS 2020. Lecture Notes in Computer Science, vol 12116. Springer, Cham. https://doi.org/10.1007/978-3-030-57058-3_2
DOI: https://doi.org/10.1007/978-3-030-57058-3_2
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-57057-6
Online ISBN: 978-3-030-57058-3
eBook Packages: Computer Science, Computer Science (R0)