Abstract
As historical Chinese calligraphy works are digitized, retrieving them becomes a new challenge. Currently, no OCR technique can convert calligraphy character images into text, and existing handwritten character recognition approaches do not work for them either. This paper proposes a novel approach to efficiently retrieving Chinese calligraphy characters on the basis of similarity: each calligraphy character image is represented by a collection of discriminative features, and high retrieval speed with reasonable effectiveness is achieved. First, calligraphy characters that cannot be similar to the query are filtered out step by step by comparing character complexity, stroke density, and stroke protrusion. Then, the remaining calligraphy characters are retrieved and ranked according to the matching cost produced by approximate shape matching. To speed up retrieval, we employ a high-dimensional data structure, the PK-tree. Finally, the efficiency of the algorithm is demonstrated by a preliminary experiment with 3012 calligraphy character images.
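The filter-then-rank idea in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the paper's method: `ink_ratio` and `row_crossings` are hypothetical stand-ins for character complexity and stroke density, stroke protrusion is left out, `hamming_cost` is a simple pixel-difference cost swapped in for the approximate shape match, and the PK-tree index is omitted entirely.

```python
from typing import Callable, List, Sequence, Tuple

Image = Sequence[Sequence[int]]  # binary image: 1 = ink, 0 = background


def ink_ratio(img: Image) -> float:
    """Hypothetical 'complexity' proxy: fraction of pixels that are ink."""
    total = sum(len(row) for row in img)
    return sum(sum(row) for row in img) / total


def row_crossings(img: Image) -> float:
    """Hypothetical 'stroke density' proxy: mean 0->1 transitions per row."""
    crossings = 0
    for row in img:
        prev = 0
        for px in row:
            if px == 1 and prev == 0:
                crossings += 1
            prev = px
    return crossings / len(img)


def hamming_cost(a: Image, b: Image) -> int:
    """Stand-in matching cost (not the paper's shape match): differing pixels."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def retrieve(
    query: Image,
    candidates: List[Image],
    filters: List[Tuple[Callable[[Image], float], float]],
    cost: Callable[[Image, Image], float] = hamming_cost,
) -> List[Tuple[int, float]]:
    """Drop candidates stage by stage, then rank survivors by matching cost."""
    survivors = list(enumerate(candidates))
    for feature, tolerance in filters:
        q = feature(query)
        # Keep only candidates whose cheap feature is close to the query's.
        survivors = [(i, c) for i, c in survivors
                     if abs(feature(c) - q) <= tolerance]
    # The expensive cost is computed only for candidates that passed all filters.
    ranked = sorted((cost(query, c), i) for i, c in survivors)
    return [(i, c) for c, i in ranked]


# Toy 3x3 "characters" (hypothetical data).
query = [[1, 1, 1], [0, 1, 0], [0, 1, 0]]
near = [[1, 1, 1], [0, 1, 0], [0, 1, 1]]
dense = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
striped = [[1, 0, 1], [1, 0, 1], [0, 1, 0]]
candidates = [dense, near, striped, query]

# Tolerances are arbitrary illustrative values.
result = retrieve(query, candidates, [(ink_ratio, 0.2), (row_crossings, 0.5)])
print(result)  # [(3, 0), (1, 1)]
```

In this toy run, `dense` is rejected by the first filter and `striped` by the second, so the pixel-level cost is computed for only two of the four candidates. Ordering the filters from cheapest to most expensive is the design choice that makes such a cascade pay off.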
Additional information
Supported by the National Natural Science Foundation of China (Grant Nos. 60533090, 60525108), the National Grand Fundamental Research 973 Program of China (Grant No. 2002CB312101), the Science and Technology Project of Zhejiang Province (2005C13032, 2005C11001-05), and the China-US Million Book Digital Library Project (www.cadal.zju.edu.cn).
Cite this article
Zhang, XF., Zhuang, YT., Wu, JQ. et al. Hierarchical Approximate Matching for Retrieval of Chinese Historical Calligraphy Character. J Comput Sci Technol 22, 633–640 (2007). https://doi.org/10.1007/s11390-007-9077-8
DOI: https://doi.org/10.1007/s11390-007-9077-8