Decomposition of Chinese character into strokes using mathematical morphology

https://doi.org/10.1016/S0167-8655(98)00147-0

Abstract

In off-line character recognition, reliable stroke extraction greatly influences the overall performance of the system. We propose a novel method of decomposing a Chinese character into a set of strokes, based on the concepts of mathematical morphology. A character is first thinned by a morphological operation and then divided into segments. Each segment is expanded by morphological operations such as elongation, fattening and isotropic expansion, and then intersected with the original input shape. The expanded segments are merged using a convexity measure at the final stage. The final merged segments are regarded as the basic strokes of the character. Experiments on handprinted Chinese characters demonstrate the validity of the proposed method.

Introduction

The popularity of automatic character recognition has attracted many researchers to the problem of Chinese character recognition. Compared with the recognition of other characters, the recognition of Chinese characters is difficult because the character set is very large and many characters are similar in shape (Chen et al., 1993). Moreover, a character is constructed by combining several strokes in two dimensions. Chinese characters are therefore much harder to recognize than the characters of other languages.

While many successful results for recognizing printed Chinese characters have been presented in the literature, there has been little progress in recognizing handprinted Chinese characters (Wang and Shiqu, 1973; Gu et al., 1983). One idea for increasing the recognition rate is structural analysis. In this approach, decomposing a character into strokes is a fundamental step, and the recognition rate is greatly influenced by how correctly the elemental strokes are extracted. However, reliable stroke extraction is not an easy problem.

Most typical approaches perform a thinning procedure on a character and extract feature points (Liao and Huang, 1990; Ogawa and Taniguchi, 1982; Xie and Suk, 1987; Kim et al., 1996). Although these methods are simple, they cannot deal with various shape distortions. Moreover, their recognition rates are greatly influenced by the side effects of thinning.

For reliable stroke extraction, it is desirable to consider the overall shape of each stroke. However, doing so is not easy and requires considerable computation time.

In this paper, we propose a new stroke extraction method using mathematical morphology. This approach also uses a thinning procedure, but it can cope with the side effects of thinning because the overall shape of the input stroke is also considered. A straight line segment is a good recognition unit if a character is composed mainly of straight strokes. Chinese characters are composed of relatively simple straight strokes (Hsieh and Lee, 1992; Boccignone et al., 1993). We regard a straight stroke as a convex shape and therefore use a convexity measure to extract convex, straight strokes. A character is first thinned by a morphological operation and divided into many segments. Each segment is expanded by morphological operations such as elongation, fattening and isotropic expansion, and then intersected with the original input image. The expanded segments are merged using a convexity measure at the final stage. The final merged segments are regarded as the basic strokes of the character.
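The division of a thinned character into segments can be illustrated with a minimal sketch. It assumes that the skeleton is one pixel wide, that it is stored as a set of (x, y) pixel coordinates, and that critical points are end points and junction points identified by counting 8-neighbours. These choices, and the names `critical_points` and `split_segments`, are illustrative assumptions rather than the criterion used in the paper.

```python
# Offsets of the 8 neighbours of a pixel.
NEIGHBOURS = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
              if (dx, dy) != (0, 0)]


def critical_points(skeleton):
    """Return (end points, junction points) of a one-pixel-wide skeleton.

    Assumption: a pixel with exactly one 8-neighbour is an end point and
    a pixel with three or more 8-neighbours is a junction point.
    """
    ends, junctions = set(), set()
    for (x, y) in skeleton:
        count = sum((x + dx, y + dy) in skeleton for dx, dy in NEIGHBOURS)
        if count == 1:
            ends.add((x, y))
        elif count >= 3:
            junctions.add((x, y))
    return ends, junctions


def split_segments(skeleton):
    """Remove junction pixels and return the 8-connected pieces that remain."""
    _, junctions = critical_points(skeleton)
    remaining = set(skeleton) - junctions
    segments = []
    while remaining:
        seed = remaining.pop()
        component, stack = {seed}, [seed]
        while stack:  # depth-first flood fill over 8-neighbours
            x, y = stack.pop()
            for dx, dy in NEIGHBOURS:
                q = (x + dx, y + dy)
                if q in remaining:
                    remaining.remove(q)
                    component.add(q)
                    stack.append(q)
        segments.append(component)
    return segments


# Hypothetical input: the skeleton of a plus-shaped character.
plus = {(x, 5) for x in range(11)} | {(5, y) for y in range(11)}
ends, junctions = critical_points(plus)
segments = split_segments(plus)
```

Note that with 8-neighbour counting the pixels adjacent to a crossing are also classified as junctions, so the whole crossing cluster is removed and each arm becomes a separate segment.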

The rest of this paper is organized as follows. Section 2 introduces the preliminary knowledge needed to understand the behavior of our method. Section 3 explains the character decomposition steps. Experimental results are shown in Section 4, and conclusions are given in Section 5.

Section snippets

Preliminaries

The main routine of the proposed method uses morphological operations in the character thinning stage and the segment expansion stage. In the merging stage, a convexity measure is used to check whether two expanded segments can be merged. Hence, we briefly introduce some preliminary knowledge about morphological operations and the convexity criterion.
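As an illustration of how a convexity check on two segments might work, the sketch below uses one common definition: the ratio of the number of pixels in a shape to the number of lattice points inside its convex hull. This definition, the threshold value of 0.9, and the names `convexity` and `can_merge` are assumptions made for the example; the exact measure used in the paper is not given in this excerpt.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def convexity(pixels):
    """Pixel count divided by the number of lattice points in the hull.

    Returns 1.0 for a digitally convex shape and smaller values for
    shapes with concavities (assumed definition, see the text above).
    """
    hull = convex_hull(pixels)
    n = len(hull)
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    inside = 0
    # Only points in the bounding box can lie inside the hull.
    for x in range(min(xs), max(xs) + 1):
        for y in range(min(ys), max(ys) + 1):
            if all(cross(hull[i], hull[(i + 1) % n], (x, y)) >= 0
                   for i in range(n)):
                inside += 1
    return len(pixels) / inside


def can_merge(seg_a, seg_b, threshold=0.9):
    """Merge test: the union must stay nearly convex (threshold is assumed)."""
    return convexity(seg_a | seg_b) >= threshold


# Hypothetical segments: two halves of one horizontal bar, and an L shape.
left = {(x, y) for x in range(0, 5) for y in range(3)}
right = {(x, y) for x in range(5, 10) for y in range(3)}
horizontal = {(x, 0) for x in range(10)}
vertical = {(0, y) for y in range(10)}
```

Under these assumptions, `can_merge(left, right)` accepts the two collinear halves as one straight stroke, while `can_merge(horizontal, vertical)` rejects the L-shaped union because its convexity is well below the threshold.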

Decomposition of a character into strokes

To decompose a character into strokes, a set of skeleton segments is first obtained by thinning the input character and detecting critical points. Each skeleton segment is then selectively expanded or elongated. Elongated segments are fattened, that is, dilated in the direction orthogonal to the elongation axis. Finally, the fattened or expanded segments are merged to obtain a set of strokes, as shown in Fig. 3. The degree of elongation, fattening, isotropic expansion and merging is based on morphological convexity
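The elongation and fattening steps can be sketched as conditional dilations, that is, dilations with a directional structuring element whose result is intersected with the original character shape after every step. In this minimal sketch the shape and the segment are sets of (x, y) pixels, the segment direction is supplied by the caller, and the amount of growth is a fixed step count. The source instead governs the degree of growth by morphological convexity, so the step counts and the names `conditional_dilate` and `expand_segment` are assumptions for illustration only.

```python
def dilate(pixels, offsets):
    """Binary dilation of a pixel set by a structuring element (offset list)."""
    return {(x + dx, y + dy) for (x, y) in pixels for (dx, dy) in offsets}


def conditional_dilate(pixels, offsets, shape, steps):
    """Dilate repeatedly, intersecting with `shape` after each step.

    Stops early once the result no longer changes.
    """
    current = set(pixels) & shape
    for _ in range(steps):
        grown = dilate(current, offsets) & shape
        if grown == current:
            break
        current = grown
    return current


def expand_segment(segment, direction, shape, elong_steps, fat_steps):
    """Elongate along `direction`, then fatten orthogonally to it.

    `direction` is an integer step such as (1, 0) or (1, 1); the fixed
    step counts stand in for the convexity-based stopping rule.
    """
    dx, dy = direction
    along = [(0, 0), (dx, dy), (-dx, -dy)]    # line element along the axis
    across = [(0, 0), (-dy, dx), (dy, -dx)]   # line element across the axis
    elongated = conditional_dilate(segment, along, shape, elong_steps)
    return conditional_dilate(elongated, across, shape, fat_steps)


# Hypothetical input: a plus-shaped character with bars three pixels thick.
h_bar = {(x, y) for x in range(11) for y in range(4, 7)}
v_bar = {(x, y) for x in range(4, 7) for y in range(11)}
shape = h_bar | v_bar
segment = {(1, 5), (2, 5), (3, 5)}  # left arm of the skeleton
stroke = expand_segment(segment, (1, 0), shape, elong_steps=20, fat_steps=1)
```

With one fattening step the left skeleton arm grows into the full horizontal bar. A larger fixed `fat_steps` would leak into the vertical bar at the crossing, which is the kind of over-expansion that a convexity-based limit is meant to prevent.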

Experimental results

To evaluate the proposed method, we tested our character decomposition system on 5 sets of 1800 basic handprinted Chinese characters used daily in Korea. The method was implemented in Visual C++ on an IBM PC with a Pentium 100 processor running Windows 95. The input character images were obtained with an HP ScanJet 4C scanner. The experimental images are 128 × 128 pixels.

Fig. 7 and Fig. 8 present examples of decomposition of character `

Conclusions

Reliable stroke extraction is a crucial factor in off-line character recognition systems. For this purpose, we have proposed a stroke extraction method using mathematical morphology. This method extracts straight line segments while considering the overall shape of a stroke. A character is thinned using a morphological operation and divided into many segments. A segment is expanded using elongation and fattening, or isotropic expansion, according to its length. Each expanded segment is intersected with

References (14)

There are more references available in the full text version of this article.

Cited by (18)

  • Graphic complexity in writing systems

    2021, Cognition
    Citation Excerpt:

    Lower visual complexity correlates with easier learning, processing and use (Pelli, Burns, Farell, & Moore-Page, 2006). In addition to being easier to perceive, complex shapes arguably require more motor effort to produce, since they tend to involve a greater number of distinct strokes (distinct hand movements typically separated by a lifting of the inscribing instrument — Chang, Plaut, & Perfetti, 2016; Kim, Kim, Choi, & Kim, 1999; Rovenchak, Mačutek, & Riley, 2009). Simpler letters are easier on the eye and easier on the hand.
