Abstract
In text recognition, complex glyphs and tail classes have long been factors that degrade model performance. In Chinese text recognition in particular, a lack of shape awareness can lead to confusion among similar complex characters. Such characters are often tail classes that appear infrequently in the training set, which makes it harder for the model to capture their shape information. In this work, we therefore propose a structure-aware network that uses hierarchical composition information to improve the recognition of complex characters. Implementation-wise, we first propose an auxiliary radical branch and integrate it into the base recognition network as a regularization term, which distills hierarchical composition information into the feature extractor. A Tree-Similarity-based weighting mechanism is then proposed to further exploit the depth information in the hierarchical representation. Experiments demonstrate that the proposed approach significantly improves performance on complex characters and tail characters, yielding better overall performance. Code is available at https://github.com/Levi-ZJY/SAN.
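As a rough illustration of the two ideas named in the abstract, the sketch below shows one way a depth-weighted similarity between hierarchical decomposition trees could be computed, and how an auxiliary radical loss could enter the objective as a regularization term. It is not the authors' implementation: the tree encoding, the function names (`tree_similarity`, `total_loss`), the `decay` factor, and the weight `lam` are all assumptions made for this example; the linked repository holds the actual method.

```python
from typing import Tuple, Union

# Hypothetical encoding: a leaf is a radical string; an internal node is
# (structure_symbol, child, child, ...). Not the paper's data format.
Tree = Union[str, Tuple]


def label(t: Tree) -> str:
    """Return the node label: the radical itself or the structure symbol."""
    return t if isinstance(t, str) else t[0]


def children(t: Tree) -> tuple:
    """Return the child subtrees (empty for a leaf)."""
    return () if isinstance(t, str) else t[1:]


def weighted_size(t: Tree, depth: int = 0, decay: float = 0.5) -> float:
    """Sum decay**depth over all nodes, so shallow nodes count more."""
    return decay ** depth + sum(
        weighted_size(c, depth + 1, decay) for c in children(t)
    )


def matched_weight(a: Tree, b: Tree, depth: int = 0, decay: float = 0.5) -> float:
    """Depth-weighted count of nodes that agree at the same position, top-down."""
    if label(a) != label(b):
        return 0.0  # a mismatch stops the comparison for this subtree
    total = decay ** depth
    for ca, cb in zip(children(a), children(b)):
        total += matched_weight(ca, cb, depth + 1, decay)
    return total


def tree_similarity(a: Tree, b: Tree, decay: float = 0.5) -> float:
    """Similarity in [0, 1]; equals 1.0 when the two trees are identical."""
    denom = max(weighted_size(a, 0, decay), weighted_size(b, 0, decay))
    return matched_weight(a, b, 0, decay) / denom


def total_loss(char_loss: float, radical_loss: float, lam: float = 1.0) -> float:
    """Character loss plus the auxiliary radical loss as a regularizer (assumed form)."""
    return char_loss + lam * radical_loss


# Toy decompositions written with ideographic description characters.
hao = ("⿰", "女", "子")  # 好: left-right composition
ma = ("⿰", "女", "马")   # 妈: same structure and left radical as 好
li = ("⿱", "木", "子")   # 李: top-bottom composition

print(tree_similarity(hao, ma))  # same structure, one shared radical
print(tree_similarity(hao, li))  # different top-level structure
```

In this toy setup, characters that share their top-level structure and some radicals score higher than characters that differ at the root, which is the kind of signal a similarity-based weighting could use to separate confusable complex characters.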
Acknowledgement
This research is supported by the National Key Research and Development Program of China (2020AAA0109700), the National Science Fund for Distinguished Young Scholars (62125601), the National Natural Science Foundation of China (62076024, 62006018), and the Interdisciplinary Research Project for Young Teachers of USTB (Fundamental Research Funds for the Central Universities) (FRF-IDRY-21-018).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, J., Liu, C., Yang, C. (2023). SAN: Structure-Aware Network for Complex and Long-Tailed Chinese Text Recognition. In: Fink, G.A., Jain, R., Kise, K., Zanibbi, R. (eds) Document Analysis and Recognition - ICDAR 2023. ICDAR 2023. Lecture Notes in Computer Science, vol 14191. Springer, Cham. https://doi.org/10.1007/978-3-031-41734-4_15
DOI: https://doi.org/10.1007/978-3-031-41734-4_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-41733-7
Online ISBN: 978-3-031-41734-4
eBook Packages: Computer Science, Computer Science (R0)