Abstract
Handwritten Chinese text recognition (HCTR) remains a challenging, unsolved problem. Existing recognition methods fall into two main categories: explicit and implicit segmentation-based methods. Explicit segmentation-based methods use explicit character location information to train the recognizer. However, the widely used weakly supervised training strategy based on pseudo-labels makes it difficult to obtain effective supervision for hard character samples. In contrast, implicit segmentation-based methods use all transcript annotations for supervised training, but they are prone to misalignment because they lack explicit supervision of character positions. To exploit the complementary nature of the two approaches, we propose a new method, SegCTC, which integrates them into a unified, more powerful recognizer. Specifically, we design a hybrid Segmentation-based and Segmentation-free Feature Fusion Module (S\(^2\)FFM) to better fuse the features of the explicit and implicit segmentation-based recognition branches. Moreover, we propose a co-transcription strategy to better combine the predictions from the different branches. Experiments on four widely used benchmarks, CASIA-HWDB, ICDAR2013, SCUT-HCCDoc, and MTHv2, show that our method achieves state-of-the-art performance on the HCTR task under different scenarios.
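To make the implicit-segmentation side concrete, the following is a minimal sketch of greedy connectionist temporal classification (CTC) decoding, assuming the implicit branch is CTC-based as the name SegCTC suggests. It is not the SegCTC implementation: the function name `ctc_greedy_decode`, the blank index, and the toy scores are all assumptions made for illustration.

```python
from itertools import groupby

BLANK = 0  # index reserved for the CTC blank symbol (an assumption of this sketch)


def ctc_greedy_decode(frame_scores, blank=BLANK):
    """Collapse per-frame argmax labels into a label sequence.

    frame_scores: list of per-frame score lists, one score per class.
    Returns (labels, spans), where spans[i] is the (first_frame, last_frame)
    range whose argmax produced labels[i].
    """
    # Pick the best-scoring class for every frame.
    best = [max(range(len(s)), key=s.__getitem__) for s in frame_scores]
    labels, spans = [], []
    t = 0
    # Merge runs of identical labels, then drop the blank runs.
    for cls, run in groupby(best):
        n = len(list(run))
        if cls != blank:
            labels.append(cls)
            spans.append((t, t + n - 1))
        t += n
    return labels, spans


# Toy input: 6 frames, 3 classes (0 = blank, 1 and 2 = hypothetical characters).
scores = [
    [0.10, 0.80, 0.10],  # frame 0 -> class 1
    [0.20, 0.70, 0.10],  # frame 1 -> class 1 (repeat, merged with frame 0)
    [0.90, 0.05, 0.05],  # frame 2 -> blank
    [0.10, 0.10, 0.80],  # frame 3 -> class 2
    [0.60, 0.20, 0.20],  # frame 4 -> blank
    [0.10, 0.10, 0.80],  # frame 5 -> class 2 (after a blank, so a new character)
]
labels, spans = ctc_greedy_decode(scores)
print(labels)  # [1, 2, 2]
print(spans)   # [(0, 1), (3, 3), (5, 5)]
```

The frame spans here are only a by-product of the argmax path. Nothing in transcript-level training ties them to true character positions, which is one way to picture the misalignment problem the abstract attributes to implicit segmentation.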
Acknowledgement
This research is supported in part by NSFC (Grant No. 61936003), the Zhuhai Industry Core and Key Technology Research Project (No. 2220004002350), the Science and Technology Foundation of Guangzhou Huangpu Development District (No. 2020GH17), and GD-NSF (No. 2021A1515011870).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Huang, J., Peng, D., Li, H., Ni, H., Jin, L. (2023). SegCTC: Offline Handwritten Chinese Text Recognition via Better Fusion Between Explicit and Implicit Segmentation. In: Fink, G.A., Jain, R., Kise, K., Zanibbi, R. (eds) Document Analysis and Recognition - ICDAR 2023. ICDAR 2023. Lecture Notes in Computer Science, vol 14190. Springer, Cham. https://doi.org/10.1007/978-3-031-41685-9_21
DOI: https://doi.org/10.1007/978-3-031-41685-9_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-41684-2
Online ISBN: 978-3-031-41685-9
eBook Packages: Computer Science, Computer Science (R0)