An Improved Convolutional Block Attention Module for Chinese Character Recognition

Zhou, Kai; Zhou, Yongsheng; Zhang, Rui; Wei, Xiaolin

doi:10.1007/978-3-030-57058-3_2

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12116)

Included in the following conference series:

International Workshop on Document Analysis Systems

Abstract

Recognizing Chinese characters in natural images is challenging because they often appear in artistic fonts and varied styles, under diverse lighting and occlusion conditions. This paper proposes ICBAM (Improved Convolutional Block Attention Module), a method for Chinese character recognition in the wild. We introduce the concept of attention disturbance and combine it with CBAM (Convolutional Block Attention Module), which improves the generalization performance of the network and effectively mitigates over-fitting. ICBAM is easy to train and deploy by design, and the module itself does not have any trainable parameters. Experiments on the ICDAR 2019 ReCTS competition dataset demonstrate that our approach significantly outperforms state-of-the-art techniques. We also verify the generalization performance of our method on the CTW dataset.
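The abstract does not describe how attention disturbance is implemented, so the following is only an illustrative sketch and not the authors' method. It assumes a CBAM-style ordering (channel attention followed by spatial attention, each built from average- and max-pooled descriptors), replaces CBAM's learned layers with a plain sum so that the block has no trainable parameters, and models "disturbance" as multiplicative random noise applied to the attention weights during training only. The function names, the `strength` parameter, and the noise model are all hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat):
    """feat has shape (C, H, W); returns per-channel weights of shape (C, 1, 1).

    Assumption: the average- and max-pooled descriptors are simply summed,
    standing in for the learned shared MLP used by the original CBAM.
    """
    avg = feat.mean(axis=(1, 2))
    mx = feat.max(axis=(1, 2))
    return sigmoid(avg + mx)[:, None, None]

def spatial_attention(feat):
    """Returns per-location weights of shape (1, H, W).

    Assumption: pooled maps are summed instead of passed through a learned conv.
    """
    avg = feat.mean(axis=0)
    mx = feat.max(axis=0)
    return sigmoid(avg + mx)[None, :, :]

def disturb(weights, rng, strength=0.1):
    """Hypothetical disturbance: scale each weight by uniform noise, clip to [0, 1]."""
    noise = rng.uniform(1.0 - strength, 1.0 + strength, size=weights.shape)
    return np.clip(weights * noise, 0.0, 1.0)

def attention_block(feat, training=False, rng=None, strength=0.1):
    """Channel attention, then spatial attention; weights are perturbed only in training."""
    if training and rng is None:
        rng = np.random.default_rng()
    ca = channel_attention(feat)
    if training:
        ca = disturb(ca, rng, strength)
    feat = feat * ca  # broadcast (C, 1, 1) over (C, H, W)
    sa = spatial_attention(feat)
    if training:
        sa = disturb(sa, rng, strength)
    return feat * sa  # broadcast (1, H, W) over (C, H, W)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))                     # toy feature map: 8 channels, 4x4
y_eval = attention_block(x)                            # deterministic at inference
y_train = attention_block(x, training=True, rng=rng)   # perturbed during training
print(y_eval.shape, y_train.shape)
```

Under these assumptions the disturbance behaves like a regularizer in the spirit of dropout: it adds noise to the attention weights during training and is switched off at inference, so deployment cost is unchanged.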

Author information

Authors and Affiliations

Meituan-Dianping Group, Beijing, China
Kai Zhou, Yongsheng Zhou, Rui Zhang & Xiaolin Wei

Corresponding author

Correspondence to Kai Zhou.

Editor information

Editors and Affiliations

Huazhong University of Science and Technology, Wuhan, China
Xiang Bai
Autonomous University of Barcelona, Barcelona, Spain
Dimosthenis Karatzas
Lehigh University, Bethlehem, PA, USA
Daniel Lopresti

About this paper

Cite this paper

Zhou, K., Zhou, Y., Zhang, R., Wei, X. (2020). An Improved Convolutional Block Attention Module for Chinese Character Recognition. In: Bai, X., Karatzas, D., Lopresti, D. (eds) Document Analysis Systems. DAS 2020. Lecture Notes in Computer Science, vol 12116. Springer, Cham. https://doi.org/10.1007/978-3-030-57058-3_2

DOI: https://doi.org/10.1007/978-3-030-57058-3_2
Published: 14 August 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-57057-6
Online ISBN: 978-3-030-57058-3
eBook Packages: Computer Science, Computer Science (R0)
