Elsevier

Pattern Recognition

Volume 34, Issue 2, February 2001, Pages 187-201
Hierarchical random graph representation of handwritten characters and its application to Hangul recognition

https://doi.org/10.1016/S0031-3203(99)00222-8

Abstract

A hierarchical random graph (HRG) representation for handwritten character modeling is presented. Based on the HRG, a recognition system for Hangul, the Korean script, has also been developed. In the HRG, the bottom layer is constructed with extended random graphs to describe various strokes, while the upper layers are constructed with random graphs (Wong and Ghahraman, IEEE Trans. Pattern Anal. Mach. Intell. 2(4) (1980) 341) to model spatial and structural relationships between strokes and between sub-characters. Because the proposed HRG is a stochastic model, recognition is formulated as the problem of choosing the model that yields the maximum probability given the input data. In this context, the matching score is obtained not from a heuristic similarity function but from a probabilistic measure. The recognition process starts by converting an input character image into an attributed graph through preprocessing and graph representation. Matching between the attributed graph and the hierarchical graph model is performed bottom-up. Since the hierarchical structure of an attributed graph is decided only after recognition ends, according to the best interpretation of the graph matching, incorrect sub-character segmentation can be avoided. The model parameters of the hierarchical graph are estimated automatically from the training data by the EM algorithm (Dempster et al., J. Roy. Stat. Soc. 39 (1977) 1) and embedded training. Recognition experiments conducted with unconstrained handwritten Hangul characters show the usefulness and effectiveness of the proposed HRG.

Introduction

A handwritten character is generated by a series of trajectories of a writing instrument. This implies that the shapes of the trajectories and the spatial relationships among them contain all the information needed to decode a handwritten character. Other factors that make handwritten instances vary can be regarded as noise added during the information transmission process by writers, writing devices, input devices, etc. In this sense, modeling handwritten characters in terms of trajectories and their spatial relationships fits their origin and is also effective in removing unnecessary variation caused by pen width and other noise. Most structural systems for handwritten character recognition have been developed under the assumption that all the information necessary to identify a handwritten character can be obtained from the trajectories, or strokes.

The main obstacle in trajectory- or stroke-based methods is that it is difficult to extract strokes robustly. Touching and crossing trajectories distort stroke shapes and produce ambiguous connections between strokes. As a consequence, it is often impossible to recover the original strokes from a character image. In fact, most structural systems have suffered from stroke extraction errors, which result from resolving ambiguous strokes using only their local shapes.

Stroke extraction errors can be reduced if strokes are extracted in a global view. Rocha and Pavlidis [1], for example, proposed a method that emphasizes identifying structural descriptions of character shapes. They postpone the decision of whether a pixel is black or white until they attempt to form a stroke. Similarly, strokes are not interpreted as straight or curved until the final matching, and a number of shape transformations and a gap-handling procedure are introduced so that the strokes of a handwritten character can be interpreted in a global view. However, because the input character is restricted to a single connected component, the method cannot be applied to more complex character sets such as Korean and Chinese characters. In addition, it may not work for letters that have more than one connected component, such as 'i' and 'j'. This implies that stroke modeling alone is not enough to describe handwritten characters; another modeling mechanism with a hierarchical representation is also needed. Even though Rocha and Pavlidis extended their method to unsegmented word recognition [2], the extension handles only one-dimensional relationships between character positions.

There are many studies on the hierarchical representation of handwritten characters. In Chen and Lieh's method [3], simple descriptions of strokes and their relationships are adopted to build hierarchical graphs. They construct a 2-layer attributed graph to represent a handwritten Chinese character: one layer is for the strokes and the other is for the components of a character. By synthesizing 2-layer attributed graphs, a 2-layer random graph is constructed as a reference model. Another method [4], based on a structural representation called the hierarchical attributed graph representation (HAGR), has also been introduced. In this method, the strokes are represented by attributed sets and then grouped into branches to construct the HAGR.

Although these hierarchical models reflect the hierarchical nature of handwritten characters, especially Chinese characters, the stroke descriptions in these systems are too simple to specify a wide variety of strokes. Such stroke descriptions may be sufficient for Chinese characters, which are mainly composed of straight lines, but in general a more sophisticated representation mechanism is essential to distinguish similar characters. Furthermore, as Rocha and Pavlidis [1] pointed out, no effort has been made in these studies to address group features that have been divided by spurious points. Instead, strokes are merged and grouped into a hierarchical graph without regard to their global correspondence, and then matched against the features of reference models by one-to-one mapping. Methods of this kind often generate grouping errors that cannot be recovered at the matching step.

To tackle these problems, a new hierarchical random graph (HRG) representation for handwritten characters is proposed in this paper. Since there are two different information sources, pen-down movement and pen-up movement, we introduce two different modeling mechanisms: one for modeling trajectories and the other for modeling the relationships between trajectories. The former focuses on describing trajectories approximated by a stroke or connected strokes, while the latter focuses on representing their relationships on the 2D plane. These two modeling mechanisms are incorporated into a hierarchical graph representation.

At the bottom layer of the HRG, extended random graphs called chain graphs are introduced to model strokes stochastically. An arc of a chain graph represents a chain of points, or a chain of features, and a vertex represents an end point or a contact point of chains. In a chain graph, a multiple-to-one mapping of features from input strokes to an arc is possible, depending on the model parameters. The correspondence between the features of input strokes and the model arcs is not determined until the best interpretation of the mapping is found. Since no constraints are imposed, any point in a stroke may be a segmentation point. In other words, a vertex of a chain graph can be matched with any point in the strokes of an input character.
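
The multiple-to-one mapping described above can be illustrated with a small dynamic-programming sketch in Python. It is not the matching procedure of the paper. It assumes a chain graph reduced to an ordered list of arcs, each holding a hypothetical categorical distribution over direction codes, and it assumes that features are independent given the arc. The names `ArcModel`, `best_segmentation` and `floor` are illustrative only. Under these assumptions every point of the input is a candidate segmentation point, and the cut points are fixed only once the best-scoring mapping has been found.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Hypothetical arc model: a categorical distribution over direction codes.
ArcModel = Dict[str, float]


def best_segmentation(
    features: Sequence[str], arcs: List[ArcModel], floor: float = 1e-6
) -> Tuple[float, List[int]]:
    """Map contiguous runs of features onto the arcs, in order.

    Returns the best log-probability and, for each arc, the exclusive end
    index of the run of features assigned to it.
    """
    n, k = len(features), len(arcs)
    if n < k:
        raise ValueError("need at least one feature per arc")
    neg_inf = float("-inf")
    # best[j][i]: best log-probability of explaining features[:i] with the first j arcs
    best = [[neg_inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for j in range(1, k + 1):
        for i in range(j, n + 1):
            run = 0.0
            # Try every start s of the run features[s:i] assigned to arc j.
            for s in range(i - 1, j - 2, -1):
                # Unseen codes get a small floor probability (an assumption).
                run += math.log(arcs[j - 1].get(features[s], floor))
                candidate = best[j - 1][s] + run
                if candidate > best[j][i]:
                    best[j][i] = candidate
                    back[j][i] = s
    # Trace the cut points back from the last arc to the first.
    cuts: List[int] = []
    i = n
    for j in range(k, 0, -1):
        cuts.append(i)
        i = back[j][i]
    cuts.reverse()
    return best[k][n], cuts


if __name__ == "__main__":
    horizontal = {"E": 0.9, "S": 0.1}  # arc that mostly emits "east" codes
    vertical = {"E": 0.1, "S": 0.9}    # arc that mostly emits "south" codes
    print(best_segmentation(["E", "E", "E", "S", "S"], [horizontal, vertical]))
```

In this toy input the corner between the two arcs is not marked anywhere in the feature sequence; it falls out of the maximization, which is the behavior the paragraph describes for vertices of a chain graph.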

The upper layers of the HRG, which represent the relationships between strokes, are constructed with the random graphs introduced by Wong et al. [5]. For character sets with a simple structure, such as numerals and the English alphabet, one layer is enough to represent the relations between strokes. However, to represent character sets with a more complex structure composed of sub-characters, such as Chinese and Korean characters, more than two layers are used to represent their hierarchical structure.

Based on the hierarchical representation, a handwritten Korean character (Hangul) recognition system has been developed. One of the principal advantages of the system is that it is a hybrid that combines the advantages of structural and statistical pattern recognition systems. In general, structural pattern recognition systems represent the structural information of characters easily and find distinctive features well, but they are sensitive to noise and hard to train, so adjusting such a system to different writing styles is difficult. Statistical systems, on the other hand, are trainable and less sensitive to noise, but they tend to miss distinctive local features, which results in confusion between similar categories. We have attempted to build a stochastic model that has the advantages of both. In the proposed system, the structural aspects of strokes are represented by the structure of graphs, and their variations are modeled statistically by the probability distributions of random variables in the graphs. The model parameters of the hierarchical graph are estimated automatically from the training data by the EM algorithm [6] and an embedded training technique.

In the proposed system, as in many stochastic methods, recognition is formulated as the problem of finding the model that yields the maximum probability given the input data. Similarity measures and cost functions, which tend to be defined somewhat arbitrarily, are not used to obtain the matching score. Instead, a matching probability estimated from the probability distributions of features in a model is used. In summary, recognition is carried out in a probabilistic framework, which is a salient characteristic of our system compared with previous systems.

The rest of this paper is organized as follows. Section 2 presents the hierarchical graph representation. A hierarchical Hangul model, an application of the hierarchical graph, is explained in Section 3. Section 4 describes the procedure for building an attributed graph from an input character image, including stroke extraction, feature encoding, gap filling to handle broken characters, and attributed graph generation. The Hangul recognition system and its parameter estimation method are explained in Sections 5 and 6, respectively. Experimental results are given in Section 7, followed by concluding remarks in Section 8.

Section snippets

Hierarchical graph representation of handwritten characters

As mentioned in the previous section, all the information necessary to interpret a handwritten character originates from the trajectories of a writing instrument and their relationships in 2D space. Thus, a handwritten character can be effectively represented with the hierarchical random graph (HRG) representation, which is layered into stroke models for trajectory modeling and relation models for modeling their relationships. In the proposed HRG, stroke models are made of chain graphs, while relation

Hierarchical Hangul model

As explained in many previous studies [10], [11], a character in Hangul, which we call a syllable, is composed of two or more graphemes on a 2D plane (see Fig. 4). Even though there are only 24 basic graphemes, the number of syllables generated by combining them is on the order of 10,000. As shown in Fig. 5, all the syllables are classified into six types according to the construction rule of graphemes, and each grapheme can be divided into the primitive
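
As a rough check on the "order of 10,000" figure, the precomposed Hangul syllable block of Unicode (U+AC00 to U+D7A3) can be enumerated arithmetically. The sketch below uses the standard Unicode decomposition into 19 initial, 21 medial and 28 final positions, where the 28 includes "no final consonant". These counts include compound consonants and vowels, so they are not the 24 basic graphemes mentioned above, and the six syllable types of the paper are not modeled here. The function names are illustrative.

```python
from typing import Tuple

S_BASE = 0xAC00                          # first precomposed Hangul syllable
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28   # T_COUNT includes "no final consonant"


def syllable_count() -> int:
    """Number of precomposed Hangul syllables in Unicode."""
    return L_COUNT * V_COUNT * T_COUNT


def decompose(syllable: str) -> Tuple[int, int, int]:
    """Return the (initial, medial, final) indices of one syllable."""
    index = ord(syllable) - S_BASE
    if not 0 <= index < syllable_count():
        raise ValueError("not a precomposed Hangul syllable")
    initial, rest = divmod(index, V_COUNT * T_COUNT)
    medial, final = divmod(rest, T_COUNT)
    return initial, medial, final


def compose(initial: int, medial: int, final: int) -> str:
    """Inverse of decompose()."""
    return chr(S_BASE + (initial * V_COUNT + medial) * T_COUNT + final)


if __name__ == "__main__":
    print(syllable_count())   # 11172
    print(decompose("한"))    # (18, 0, 4)
```

The point of the sketch is only the combinatorics: a small grapheme inventory yields more than eleven thousand syllable classes, which is why modeling graphemes and their relationships is more economical than modeling each syllable separately.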

Construction of attributed graph

In this section, the procedure for building an attributed graph from a handwritten character image is explained, including stroke extraction, feature encoding, gap filling, and attributed graph generation. Unlike previous studies, where the hierarchy of an attributed graph is constructed by grouping strokes before recognition, our system makes no decision about the hierarchy of the attributed graph in advance; the hierarchy is simply a by-product of matching against the hierarchical model.

Recognition system

Recognition in statistical approaches can be formulated as the problem of finding the model that yields the maximum probability given the input data. Let $M_i$ be a model and $X$ be an input graph. Recognition then means finding the model that maximizes the a posteriori probability given the observation graph $X$, i.e.,

$$\hat{M} = \arg\max_{M_i} P(M_i \mid X).$$

Using Bayes' rule, the a posteriori probability $P(M_i \mid X)$ can be written as

$$P(M_i \mid X) = \frac{P(X \mid M_i)\,P(M_i)}{P(X)}.$$

Since $P(X)$ is independent of $M_i$, the maximum a posteriori (MAP)
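
The decision rule in this snippet can be sketched numerically. The Python fragment below is a minimal illustration rather than the recognizer of the paper. It assumes that a log-likelihood log P(X|M_i) is already available for each candidate model (in the paper this would come from graph matching, which is not reproduced here) and that the candidate set is exhaustive, so that P(X) can be obtained by summing over the candidates. The function name `map_decision` and the model labels are hypothetical.

```python
import math
from typing import Dict, Tuple


def map_decision(
    log_likelihoods: Dict[str, float], priors: Dict[str, float]
) -> Tuple[str, Dict[str, float]]:
    """Return the MAP model and the posterior P(M_i | X) of every model."""
    # log P(X | M_i) + log P(M_i): the numerator of Bayes' rule, in log form
    joint = {m: ll + math.log(priors[m]) for m, ll in log_likelihoods.items()}
    # log P(X) = log sum_i P(X | M_i) P(M_i), computed with log-sum-exp
    peak = max(joint.values())
    log_evidence = peak + math.log(sum(math.exp(v - peak) for v in joint.values()))
    posteriors = {m: math.exp(v - log_evidence) for m, v in joint.items()}
    # P(X) is the same for every model, so the argmax needs only the numerator.
    best = max(joint, key=joint.get)
    return best, posteriors


if __name__ == "__main__":
    likelihoods = {"A": math.log(0.02), "B": math.log(0.01)}
    priors = {"A": 0.25, "B": 0.75}
    print(map_decision(likelihoods, priors))
```

In this toy case model A has the larger likelihood, but the prior tips the decision to model B, whose posterior is 0.6. Working in the log domain avoids underflow when a likelihood is a product of many small probabilities.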

Parameter estimation

Automatic model generation by synthesizing attributed graphs has been studied. However, in many cases the number of models and their structure cannot be determined unambiguously by synthesizing various attributed graphs that contain noise. The generated models tend to be either too general or too specialized to describe the attributed graphs. Thus, we think that when there is enough prior knowledge to design the structure, manual design of the structure works better than automatic

Experimental results

As an application of the proposed HRG, we have developed a Hangul recognition system in which a hierarchical Hangul model is introduced. In the hierarchical Hangul model, the number of models for a grapheme is determined by its shape variation. While only one model is sufficient to represent simple graphemes such as [grapheme images not available], etc., other graphemes with great variety such as '[grapheme image not available]' and '[grapheme image not available]' require more than two models. The total number of grapheme models used is 73, which is less than three

Concluding remarks

Handwritten characters, being generated from trajectories, can be effectively described by the trajectories, or strokes, and their relationships. Reflecting this characteristic, a hierarchical random graph (HRG) representation for handwritten character modeling was presented in this paper. The proposed HRG differs from previous methods in that the shapes of strokes, as well as their hierarchical relationships, are modeled stochastically so that the recognition can be thoroughly

About the Author—HO YON KIM received the B.S. degree in computer science from Yonsei University in 1992, and the M.S. and Ph.D. degrees in computer science from KAIST in 1994 and 1999, respectively. He was a visiting researcher at NHK research center for two months in 1997 and a visiting researcher at SIEMENS for three months in 1999, and has been working for ETRI since 1999. His research interests include pattern recognition, neural networks, and machine learning.

References (13)

Cited by (39)

  • Graph matching and clustering using kernel attributes

    2013, Neurocomputing
    Citation Excerpt :

    Our main goal in this paper is to unsupervisedly learn the prototypes describing a set of non-attributed graphs, with no more information than the structure itself. Previous graph clustering approaches have mainly dealt with attributed relational graphs (ARGs), such as random graphs [10–12], the function-described graph (FDG) model [13], and hierarchical random graphs (HRG) [14]. However, when there are no attributes and only structural information is available, these previous approaches cannot be applied.

  • Bayesian network modeling of strokes and their relationships for on-line handwriting recognition

    2004, Pattern Recognition
    Citation Excerpt :

    Stroke relationships are also incorporated in various ways. They are encoded as features such as distances and angles between strokes [12]. Their symbolic descriptions like intersection and parallel relationships are also incorporated [13].

  • Private Hierarchical Clustering in Federated Networks

    2021, Proceedings of the ACM Conference on Computer and Communications Security

About the Author—JIN H. KIM received the B.S. degree in engineering from Seoul National University in 1971, and the M.S. and Ph.D. degrees in computer science from the University of California, Los Angeles in 1979 and 1983, respectively. He was a research engineer at the Korea Institute of Science and Technology (KIST) from 1973 to 1976, an engineering programmer at the County of Orange, California, USA, from 1976 to 1977, and a senior staff member in computer science at Hughes Artificial Intelligence Center, Calabasas, California, USA, from 1981 to 1985. He joined the faculty of KAIST in 1985. He was a visiting scientist at IBM Watson Research Center in 1990. From 1995 to 1999, he was the president of the Korea R & D Information Center (KORDIC). He has served on several editorial boards, including those of the International Journal of Information Processing & Management, the International Journal of Autonomous Systems, and the International Journal of Chinese and Oriental Language Processing. His research interests include pattern recognition, intelligent man-machine interfaces, and AI for education.
