
Discovering similar Chinese characters in online handwriting with deep convolutional neural networks

  • Original Paper
International Journal on Document Analysis and Recognition (IJDAR)

Abstract

A primary reason for performance degradation in unconstrained online handwritten Chinese character recognition is the subtle difference between similar characters. Various methods have been proposed in previous works to address the similar-character problem. These methods basically comprise two components: similar character discovery and cascaded classifiers. The goal of similar character discovery is to make the similar character pairs/sets cover as many misclassified samples as possible. We observe that the confidence of a convolutional neural network (CNN) is output in an end-to-end manner and can be understood as a type of probability metric. In this paper, we propose an algorithm that leverages CNN confidence to discover similar character pairs/sets. Specifically, a deep CNN is applied to output the top-ranked candidates and their corresponding confidence scores, followed by an accumulating and averaging procedure. We found experimentally that the number of similar character pairs differs from class to class and that the degree of confusion varies among similar character pairs. To address these problems, we propose an entropy-based similarity measurement to rank the similar character pairs/sets and reject those with low similarity. The experimental results indicate that, using 30,000 similar character pairs, our method achieves hit rates of 98.44% and 98.05% on the CASIA-OLHWDB1.0 and CASIA-OLHWDB1.0–1.2 datasets, respectively, which are significantly higher than the corresponding results produced by the MQDF-based method (95.42% and 94.49%). Furthermore, recognition of ten randomly selected similar character subsets with a two-stage classification scheme results in a relative error reduction of 30.11% compared with a traditional single-stage scheme, showing the potential usefulness of the proposed method.
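The discovery procedure summarized in the abstract (top-ranked candidates, accumulation and averaging of confidences, then entropy-based ranking and rejection) can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything here is an assumption: the function names `average_confidences` and `rank_similar_pairs`, the `top_k` and `min_score` parameters, and in particular the use of a two-class binary entropy as the similarity score, which is a stand-in and not necessarily the authors' measurement.

```python
import numpy as np

def average_confidences(probs, labels, num_classes, top_k=10):
    """Accumulate the top-k CNN confidences per true class, then average.

    probs:  (N, C) array of per-sample confidence scores (e.g. softmax outputs).
    labels: (N,) array of integer ground-truth class indices.
    Returns a (C, C) matrix whose row i holds the mean confidence that
    samples of class i assign to each candidate class.
    """
    acc = np.zeros((num_classes, num_classes))
    counts = np.zeros(num_classes)
    for p, y in zip(probs, labels):
        top = np.argsort(p)[::-1][:top_k]   # indices of the top-k candidates
        acc[y, top] += p[top]               # accumulate their confidences
        counts[y] += 1
    counts[counts == 0] = 1                 # avoid division by zero for unseen classes
    return acc / counts[:, None]

def binary_entropy(p):
    """Entropy in bits of a two-outcome distribution (p, 1 - p)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return float(-(p * np.log2(p) + (1.0 - p) * np.log2(1.0 - p)))

def rank_similar_pairs(avg, min_score=0.1):
    """Rank ordered pairs (i, j) by an assumed entropy-based score.

    Assumption: the score is the binary entropy of the confidence mass
    split between the true class i and the candidate j. The true-class
    share is clamped to at least 0.5 so that pairs where the wrong class
    dominates receive the maximum score. Pairs below min_score are rejected.
    """
    pairs = []
    n = avg.shape[0]
    for i in range(n):
        for j in range(n):
            if i == j or avg[i, j] <= 0.0:
                continue
            share = avg[i, i] / (avg[i, i] + avg[i, j])
            score = binary_entropy(max(share, 0.5))
            if score >= min_score:
                pairs.append((i, j, score))
    pairs.sort(key=lambda t: t[2], reverse=True)
    return pairs

if __name__ == "__main__":
    # Toy confidences for three classes; classes 0 and 1 are confusable.
    probs = np.array([[0.6, 0.4, 0.0],
                      [0.5, 0.5, 0.0],
                      [0.3, 0.7, 0.0],
                      [0.0, 0.0, 1.0]])
    labels = np.array([0, 0, 1, 2])
    avg = average_confidences(probs, labels, num_classes=3, top_k=2)
    for i, j, score in rank_similar_pairs(avg):
        print(i, j, round(score, 4))
```

In this sketch a pair scores close to 1 bit when the network splits its confidence almost evenly between the two classes, and close to 0 when the true class clearly dominates, so thresholding on `min_score` plays the role of rejecting low-similarity pairs.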


Acknowledgments

The authors thank all reviewers for their valuable comments on improving the quality of this paper. This research is supported in part by NSFC (Grant No. 61472144), the National Science and Technology Support Plan (Grant Nos. 2013BAH65F01, 2013BAH65F04), GDSTP (Grant Nos. 2013B010202004, 2014A010103012, 2015B010101004, 2015B010130003), GDUPS (2011), and the Research Fund for the Doctoral Program of Higher Education of China (Grant No. 20120172110023).

Author information

Corresponding author

Correspondence to Lianwen Jin.

About this article

Cite this article

Zhang, S., Jin, L. & Lin, L. Discovering similar Chinese characters in online handwriting with deep convolutional neural networks. IJDAR 19, 237–252 (2016). https://doi.org/10.1007/s10032-016-0268-0

  • DOI: https://doi.org/10.1007/s10032-016-0268-0
