Skip to main content

SegCTC: Offline Handwritten Chinese Text Recognition via Better Fusion Between Explicit and Implicit Segmentation

  • Conference paper
  • First Online:
Document Analysis and Recognition - ICDAR 2023 (ICDAR 2023)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14190))

Included in the following conference series:

  • 661 Accesses

Abstract

Handwritten Chinese text recognition (HCTR) is still a challenging and unsolved problem. The existing recognition methods are mainly categorized into two: explicit vs implicit segmentation-based methods. Explicit segmentation recognition methods use explicit character location information to train the recognizers. However, the widely used weakly supervised training strategy based on pseudo-label makes it difficult to get effective supervised training for difficult character samples. In contrast, the implicit segmentation recognition method use all transcript annotations for supervised training, but it is prone to misalignment problem due to the lack of explicit supervised information of character positions. To take advantage of the complementary nature of explicit and implicit segmentation approaches, we propose a new method, SegCTC, which better integrates these two approaches into a unified to be a more powerful recognizer. Specifically, we designed a hybrid Segmentation-based and Segmentation-free Feature Fusion Module (S\(^2\)FFM) to better fuse the features of both explicit and implicit segmentation-based recognition branches. Moreover, a co-transcription strategy is also proposed to better combine the predictions from different branches. Experiments on four widely used benchmarks including CASIA-HWDB, ICDAR2013, SCUT-HCCDoc and MTHv2 show that our method achieves state-of-the-art performance for the HCTR task under different scenarios.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 99.00
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 129.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

  1. Baek, Y., Lee, B., Han, D., Yun, S., Lee, H.: Character region awareness for text detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 9365–9374 (2019)

    Google Scholar 

  2. Du, J., Wang, Z.R., Zhai, J.F., Hu, J.S.: Deep neural network based hidden Markov model for offline handwritten Chinese text recognition. In: 2016 23rd International Conference on Pattern Recognition (ICPR), pp. 3428–3433. IEEE (2016)

    Google Scholar 

  3. Graves, A., Fernández, S., Gomez, F., Schmidhuber, J.: Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks. In: Proceedings of the 23rd International Conference on Machine Learning (ICML), pp. 369–376 (2006)

    Google Scholar 

  4. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016)

    Google Scholar 

  5. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)

    Article  Google Scholar 

  6. Huang, Y., Jin, L., Peng, D.: Zero-shot Chinese text recognition via matching class embedding. In: Lladós, J., Lopresti, D., Uchida, S. (eds.) ICDAR 2021. LNCS, vol. 12823, pp. 127–141. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-86334-0_9

    Chapter  Google Scholar 

  7. Liu, B., Sun, W., Kang, W., Xu, X.: Searching from the prediction of visual and language model for handwritten Chinese text recognition. In: Lladós, J., Lopresti, D., Uchida, S. (eds.) ICDAR 2021. LNCS, vol. 12823, pp. 274–288. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-86334-0_18

    Chapter  Google Scholar 

  8. Liu, C.L., Yin, F., Wang, D.H., Wang, Q.F.: CASIA online and offline Chinese handwriting databases. In: 2011 International Conference on Document Analysis and Recognition (ICDAR), pp. 37–41. IEEE (2011)

    Google Scholar 

  9. Loshchilov, I., Hutter, F.: Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101 (2017)

  10. Luo, C., Jin, L., Sun, Z.: Moran: a multi-object rectified attention network for scene text recognition. Pattern Recognit. 90, 109–118 (2019)

    Article  Google Scholar 

  11. Ma, W., Zhang, H., Jin, L., Wu, S., Wang, J., Wang, Y.: Joint layout analysis, character detection and recognition for historical document digitization. In: 2020 17th International Conference on Frontiers in Handwriting Recognition (ICFHR), pp. 31–36. IEEE (2020)

    Google Scholar 

  12. Messina, R., Louradour, J.: Segmentation-free handwritten Chinese text recognition with LSTM-RNN. In: 2015 13th International Conference on Document Analysis and Recognition (ICDAR), pp. 171–175. IEEE (2015)

    Google Scholar 

  13. Neubeck, A., Van Gool, L.: Efficient non-maximum suppression. In: 18th International Conference on Pattern Recognition (ICPR). vol. 3, pp. 850–855. IEEE (2006)

    Google Scholar 

  14. Peng, D., Jin, L., Ma, W., Xie, C., Zhang, H., Zhu, S., Li, J.: Recognition of handwritten chinese text by segmentation: A segment-annotation-free approach. IEEE Trans, Multimedia (2022)

    Google Scholar 

  15. Peng, D., Jin, L., Wu, Y., Wang, Z., Cai, M.: A fast and accurate fully convolutional network for end-to-end handwritten Chinese text segmentation and recognition. In: 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 25–30. IEEE (2019)

    Google Scholar 

  16. Shi, B., Bai, X., Yao, C.: An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition. IEEE Trans. Pattern Anal. Mach. Intell. 39(11), 2298–2304 (2016)

    Article  Google Scholar 

  17. Shi, B., Yang, M., Wang, X., Lyu, P., Yao, C., Bai, X.: ASTER: an attentional scene text recognizer with flexible rectification. IEEE Trans. Pattern Anal. Mach. Intell. 41(9), 2035–2048 (2018)

    Article  Google Scholar 

  18. Su, T.H., Zhang, T.W., Guan, D.J., Huang, H.J.: Off-line recognition of realistic Chinese handwriting using segmentation-free strategy. Pattern Recognit. 42(1), 167–182 (2009)

    Article  MATH  Google Scholar 

  19. Tanaka, R., Osada, K., Furuhata, A.: Text-conditioned character segmentation for CTC-based text recognition. In: Lladós, J., Lopresti, D., Uchida, S. (eds.) ICDAR 2021. LNCS, vol. 12823, pp. 142–156. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-86334-0_10

    Chapter  Google Scholar 

  20. Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems (NIPS). vol. 30 (2017)

    Google Scholar 

  21. Wang, D.H., Liu, C.L., Zhou, X.D.: An approach for real-time recognition of online Chinese handwritten sentences. Pattern Recognit. 45(10), 3661–3675 (2012)

    Article  Google Scholar 

  22. Wang, Q.F., Yin, F., Liu, C.L.: Handwritten Chinese text recognition by integrating multiple contexts. IEEE Trans. Pattern Anal. Mach. Intell. 34(8), 1469–1481 (2011)

    Article  Google Scholar 

  23. Wang, S., Chen, L., Xu, L., Fan, W., Sun, J., Naoi, S.: Deep knowledge training and heterogeneous CNN for handwritten Chinese text recognition. In: 2016 15th International Conference on Frontiers in Handwriting Recognition (ICFHR), pp. 84–89. IEEE (2016)

    Google Scholar 

  24. Wang, T., et al.: Decoupled attention network for text recognition. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI). vol. 34, pp. 12216–12224 (2020)

    Google Scholar 

  25. Wang, Z.X., Wang, Q.F., Yin, F., Liu, C.L.: Weakly supervised learning for over-segmentation based handwritten Chinese text recognition. In: 2020 17th International Conference on Frontiers in Handwriting Recognition (ICFHR), pp. 157–162. IEEE (2020)

    Google Scholar 

  26. Wang, Z.R., Du, J., Wang, J.M.: Writer-aware CNN for parsimonious HMM-based offline handwritten Chinese text recognition. Pattern Recognit. 100, 107102 (2020)

    Article  Google Scholar 

  27. Wang, Z.-R., Du, J., Wang, W.-C., Zhai, J.-F., Hu, J.-S.: A comprehensive study of hybrid neural network hidden Markov model for offline handwritten Chinese text recognition. Int. J. Doc. Anal. Recogn. (IJDAR) 21(4), 241–251 (2018). https://doi.org/10.1007/s10032-018-0307-0

    Article  Google Scholar 

  28. Wu, Y.C., Yin, F., Chen, Z., Liu, C.L.: Handwritten Chinese text recognition using separable multi-dimensional recurrent neural network. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). vol. 1, pp. 79–84. IEEE (2017)

    Google Scholar 

  29. Xie, C., Lai, S., Liao, Q., Jin, L.: High performance offline handwritten Chinese text recognition with a new data preprocessing and augmentation pipeline. In: Bai, X., Karatzas, D., Lopresti, D. (eds.) DAS 2020. LNCS, vol. 12116, pp. 45–59. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-57058-3_4

    Chapter  Google Scholar 

  30. Xie, Z., Huang, Y., Zhu, Y., Jin, L., Liu, Y., Xie, L.: Aggregation cross-entropy for sequence recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6538–6547 (2019)

    Google Scholar 

  31. Xing, L., Tian, Z., Huang, W., Scott, M.R.: Convolutional character networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 9126–9136 (2019)

    Google Scholar 

  32. Xiu, Y., Wang, Q., Zhan, H., Lan, M., Lu, Y.: A handwritten Chinese text recognizer applying multi-level multimodal fusion network. In: 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 1464–1469. IEEE (2019)

    Google Scholar 

  33. Yin, F., Wang, Q.F., Zhang, X.Y., Liu, C.L.: ICDAR 2013 Chinese handwriting recognition competition. In: 2013 12th International Conference on Document Analysis and Recognition (ICDAR), pp. 1464–1470. IEEE (2013)

    Google Scholar 

  34. Zeiler, M.D.: Adadelta: an adaptive learning rate method. arXiv preprint arXiv:1212.5701 (2012)

  35. Zhang, H., Liang, L., Jin, L.: SCUT-HCCDoc: a new benchmark dataset of handwritten Chinese text in unconstrained camera-captured documents. Pattern Recognit. 108, 107559 (2020)

    Article  Google Scholar 

  36. Zhu, Z.Y., Yin, F., Wang, D.H.: Attention combination of sequence models for handwritten Chinese text recognition. In: 2020 17th International Conference on Frontiers in Handwriting Recognition (ICFHR), pp. 288–294. IEEE (2020)

    Google Scholar 

Download references

Acknowledgement

This research is supported in part by NSFC (Grant No.: 61936003), Zhuhai Industry Core and Key Technology Research Project (no. 2220004002350), and Science and Technology Foundation of Guangzhou Huangpu Development District (No. 2020GH17) and GD-NSF (No.2021A1515011870).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Lianwen Jin .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Huang, J., Peng, D., Li, H., Ni, H., Jin, L. (2023). SegCTC: Offline Handwritten Chinese Text Recognition via Better Fusion Between Explicit and Implicit Segmentation. In: Fink, G.A., Jain, R., Kise, K., Zanibbi, R. (eds) Document Analysis and Recognition - ICDAR 2023. ICDAR 2023. Lecture Notes in Computer Science, vol 14190. Springer, Cham. https://doi.org/10.1007/978-3-031-41685-9_21

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-41685-9_21

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-41684-2

  • Online ISBN: 978-3-031-41685-9

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics