Fast Chinese calligraphic character recognition with large-scale data

Multimedia Tools and Applications

Abstract

Chinese calligraphy attracts wide attention for its beauty and elegance. However, because calligraphic characters are complex in shape and varied in style, common users find them difficult to recognize, so a tool that helps users identify unknown calligraphic characters would be valuable. Conventional OCR (Optical Character Recognition) technology is of little help because the characters are heavily deformed and complex. In CADAL, a Calligraphic Character Dictionary (CalliCD), which contains character images labeled with their semantic meaning, has been constructed and made available online to common users. With CalliCD, a user can learn more about an unknown calligraphic character by performing a similarity-based search. As CalliCD grows, however, one-to-one similarity search takes an intolerably long time, and strategies that can handle large-scale data are needed. In this paper, a fast retrieval-based recognition scheme is proposed. In addition, a novel shape descriptor, called GIST-SC, is proposed to represent calligraphic character images for efficient and effective retrieval. The scheme works in three steps. First, approximate nearest neighbors of the character image to be recognized are found quickly. Second, one-to-one fine matching is performed between these approximate nearest neighbors and the character image to be recognized. Finally, the recognition result is given based on semantic probability. Our experiments show that the GIST-SC descriptor and the recognition scheme are efficient and effective for Chinese calligraphic character recognition with CalliCD.
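The three steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not specify the components used. The sketch assumes random-hyperplane hashing for the approximate nearest-neighbor step, plain Euclidean distance as a stand-in for fine matching on GIST-SC descriptors, and a similarity-weighted vote as a stand-in for the semantic probability. All function names and parameters (`make_hyperplanes`, `build_index`, `recognize`, `max_hamming`, `top_k`) are hypothetical.

```python
import math
import random
from collections import defaultdict


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes (assumed hashing scheme, not from the paper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_code(vec, planes):
    """Binary code: one bit per hyperplane, set by the sign of the dot product."""
    return tuple(
        1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def build_index(descriptors, planes):
    """Group dictionary entries by hash code so similar entries share a bucket."""
    buckets = defaultdict(list)
    for i, desc in enumerate(descriptors):
        buckets[hash_code(desc, planes)].append(i)
    return buckets


def hamming(a, b):
    """Number of differing bits between two codes of equal length."""
    return sum(x != y for x, y in zip(a, b))


def recognize(query, descriptors, labels, planes, buckets, max_hamming=1, top_k=5):
    """Return a dict mapping each candidate label to an estimated probability."""
    # Step 1: coarse search. Collect entries whose code is close to the query code.
    q_code = hash_code(query, planes)
    candidates = [
        i
        for code, ids in buckets.items()
        if hamming(code, q_code) <= max_hamming
        for i in ids
    ]
    if not candidates:
        return {}

    # Step 2: fine matching. Rank candidates by exact distance
    # (Euclidean distance here is a placeholder for the paper's shape matching).
    ranked = sorted(candidates, key=lambda i: math.dist(query, descriptors[i]))
    ranked = ranked[:top_k]

    # Step 3: label probability. Closer matches contribute larger votes.
    votes = defaultdict(float)
    for i in ranked:
        votes[labels[i]] += 1.0 / (1.0 + math.dist(query, descriptors[i]))
    total = sum(votes.values())
    return {label: weight / total for label, weight in votes.items()}


if __name__ == "__main__":
    # Toy 4-dimensional "descriptors"; real descriptors would be far longer.
    planes = make_hyperplanes(dim=4, n_bits=8)
    descs = [[1.0, 2.0, 0.5, -1.0], [-1.0, -2.0, -0.5, 1.0]]
    labels = ["yong", "he"]
    index = build_index(descs, planes)
    print(recognize([0.5, 1.0, 0.25, -0.5], descs, labels, planes, index))
```

The point of the coarse step is that the expensive one-to-one comparison runs only over a short candidate list. Note that this sketch still scans every occupied bucket to compare codes; a production index would probe neighboring codes directly.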

Notes

  1. China Academic Digital Associate Library - http://www.cadal.zju.edu.cn

  2. When a beginner is taught to write calligraphic characters, the learner is usually asked to write each character on a nine-square grid to shape the character's structure.

Acknowledgments

This work is supported by the National Natural Science Foundation of China under Grant No. 61379073 and by the CADAL Project and Research Center, Zhejiang University. We thank all the reviewers for helping us improve our work.

Author information

Corresponding author

Correspondence to Wu Jiangqin.

Cite this article

Pengcheng, G., Jiangqin, W., Yuan, L. et al. Fast Chinese calligraphic character recognition with large-scale data. Multimed Tools Appl 74, 7221–7238 (2015). https://doi.org/10.1007/s11042-014-1969-3
