Abstract
Although great success has been achieved in online handwritten Chinese text recognition (OLHCTR), most existing methods, based on over-segmentation or long short-term memory, are inefficient and not parallelizable. Moreover, many existing systems adopt n-gram language models and the beam search algorithm for post-processing, resulting in very low speed and a large memory footprint. To this end, we propose a fast, accurate and compact approach for OLHCTR. The proposed method consists of a global and local relationship network (GLRNet) and a Transformer-based language model (TransLM). GLRNet introduces a novel feature extraction mechanism that alternately learns global and local dependencies of the input trajectories for the recognition of online texts. Based on the output of GLRNet, TransLM captures contextual information through a Transformer encoder and further improves recognition accuracy. Recognition and language modelling are usually treated as two separate parts; in contrast, the two components of our method are jointly optimized, which ensures the optimal performance of the whole model. Furthermore, the non-recurrent design improves the parallelization and efficiency of our method, and the parameterized TransLM avoids the large footprint needed to store n-gram probabilities. Experiments on CASIA-OLHWDB2.0-2.2 and the ICDAR 2013 competition dataset show that our method achieves state-of-the-art performance with the fastest speed and the smallest footprint. In particular, when a language model is used, our method is 2 to 130 times faster than existing methods.
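The abstract describes GLRNet's core idea of alternately mixing local and global dependencies along the input trajectory. The following is only an illustrative numpy sketch of that alternation, not the authors' architecture: `local_block` (a small sliding-window average standing in for a learned 1-D convolution) and `global_block` (unparameterized scaled dot-product self-attention) are hypothetical simplifications chosen to show the pattern of stacking local and global mixing without recurrence.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def global_block(x):
    # Global dependencies: every trajectory point attends to all others
    # via scaled dot-product self-attention (no recurrence, parallelizable).
    scores = x @ x.T / np.sqrt(x.shape[-1])
    return softmax(scores, axis=-1) @ x

def local_block(x, k=3):
    # Local dependencies: each point mixes with a small window of
    # neighbours (a stand-in for a learned 1-D convolution).
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)), mode="edge")
    return np.stack([xp[i:i + k].mean(axis=0) for i in range(x.shape[0])])

def glrnet_like_features(x, depth=2):
    # Alternate local and global mixing, as the abstract describes.
    for _ in range(depth):
        x = local_block(x)
        x = global_block(x)
    return x

# 16 trajectory points, each an 8-dim feature vector.
x = np.random.default_rng(0).standard_normal((16, 8))
y = glrnet_like_features(x)
print(y.shape)  # (16, 8): sequence length is preserved
```

Because every block is a fixed-size matrix operation over the whole sequence, all time steps are processed at once, which is the source of the parallelism advantage over LSTM-based recognizers.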
Acknowledgment
This research is supported in part by NSFC (Grant Nos. 61936003 and 61771199) and GD-NSF (Grant No. 2017A030312006).
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Peng, D., et al. (2021). Towards fast, accurate and compact online handwritten Chinese text recognition. In: Lladós, J., Lopresti, D., Uchida, S. (eds.) Document Analysis and Recognition – ICDAR 2021. Lecture Notes in Computer Science, vol. 12823. Springer, Cham. https://doi.org/10.1007/978-3-030-86334-0_11
Print ISBN: 978-3-030-86333-3
Online ISBN: 978-3-030-86334-0