Abstract
Chinese calligraphy attracts wide attention for its beauty and elegance. However, because calligraphic characters vary so much in shape and style, ordinary users find them difficult to recognize, and a tool that helps identify unknown calligraphic characters would be valuable. Conventional OCR (Optical Character Recognition) technology is of little help because the characters are heavily deformed and complex. In CADAL, a Calligraphic Character Dictionary (CalliCD), which contains character images labeled with their semantic meaning, has been constructed and made available online. With CalliCD, a user can learn about an unknown calligraphic character by performing a similarity-based search. As CalliCD grows, however, one-to-one similarity search over the whole dictionary takes intolerably long, so strategies that can handle large-scale data are needed. In this paper, a fast retrieval-based recognition schema is proposed. In addition, a novel shape descriptor, called GIST-SC, is proposed to represent calligraphic character images for efficient and effective retrieval. The schema works in three steps. First, approximate nearest neighbors of the character image to be recognized are found quickly. Second, one-to-one fine matching is performed between these approximate nearest neighbors and the character image to be recognized. Finally, the recognition result is given based on semantic probability. Our experiments show that the GIST-SC descriptor and the recognition schema are efficient and effective for Chinese calligraphic character recognition with CalliCD.
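The three-step schema can be illustrated with a minimal sketch. The abstract does not specify the hashing method, the fine-matching distance, or the exact form of the semantic probability, so everything concrete here is an assumption made for illustration: random-hyperplane hashing stands in for the approximate nearest neighbor search, Euclidean distance stands in for fine matching, and inverse-distance-weighted voting stands in for the semantic probability. The feature vectors are generic placeholders, not the GIST-SC descriptor, and the names `build_index` and `recognize` are hypothetical.

```python
import numpy as np
from collections import defaultdict


def build_index(features, n_bits=16, seed=0):
    """Hash each dictionary feature vector into a bucket.

    Assumption: random-hyperplane hashing is used here only as a
    stand-in for whatever approximate search the paper employs.
    """
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((n_bits, features.shape[1]))
    codes = (features @ planes.T) > 0  # one sign pattern per vector
    buckets = defaultdict(list)
    for i, code in enumerate(codes):
        buckets[code.tobytes()].append(i)
    return planes, buckets


def recognize(query, features, labels, planes, buckets, k=5):
    """Return a dict mapping candidate labels to estimated probabilities."""
    # Step 1: coarse retrieval of approximate nearest neighbors.
    code = ((planes @ query) > 0).tobytes()
    candidates = buckets.get(code, [])
    if len(candidates) == 0:
        # Empty bucket: fall back to scanning the whole dictionary.
        candidates = range(len(features))
    candidates = np.fromiter(candidates, dtype=int)

    # Step 2: one-to-one fine matching on the candidates only
    # (assumption: plain Euclidean distance between feature vectors).
    dists = np.linalg.norm(features[candidates] - query, axis=1)
    nearest = np.argsort(dists)[:k]

    # Step 3: label probabilities from inverse-distance-weighted votes
    # (assumption: a simple proxy for the paper's semantic probability).
    scores = defaultdict(float)
    for idx in nearest:
        scores[labels[candidates[idx]]] += 1.0 / (1.0 + dists[idx])
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}
```

In this sketch the expensive one-to-one comparison in step 2 runs only over the contents of one hash bucket, which is the reason a coarse retrieval stage helps as the dictionary grows. A practical system would typically probe several hash tables to reduce missed neighbors.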
Notes
China Academic Digital Associate Library - http://www.cadal.zju.edu.cn
When teaching a beginner to write calligraphic characters, the teacher usually asks the learner to write each character on a nine-square grid, which helps to structure the character.
Acknowledgments
This work is supported by the National Natural Science Foundation of China under Grant No. 61379073 and by the CADAL Project and Research Center, Zhejiang University. We thank all the reviewers for helping us improve our work.
Cite this article
Pengcheng, G., Jiangqin, W., Yuan, L. et al. Fast Chinese calligraphic character recognition with large-scale data. Multimed Tools Appl 74, 7221–7238 (2015). https://doi.org/10.1007/s11042-014-1969-3
DOI: https://doi.org/10.1007/s11042-014-1969-3