Elsevier

Pattern Recognition

Volume 40, Issue 6, June 2007, Pages 1840-1854

Fuzzy model based recognition of handwritten numerals

https://doi.org/10.1016/j.patcog.2006.08.014

Abstract

This paper presents the recognition of handwritten Hindi and English numerals by representing them in the form of exponential membership functions, which serve as a fuzzy model. The recognition is carried out by modifying the exponential membership functions fitted to the fuzzy sets. These fuzzy sets are derived from features consisting of normalized distances obtained using the Box approach. The membership function is modified by two structural parameters that are estimated by optimizing the entropy subject to the constraint that the membership function attains unity. The overall recognition rate is found to be 95% for Hindi numerals and 98.4% for English numerals.

Introduction

Styles of writing numbers differ widely, and numerals come in various sizes, shapes and fonts. The need to identify handwritten numbers in an automated or semi-automated manner has led to the development of a field of research known as optical character recognition (OCR).

Handwritten input usually occurs as connected or partially connected strings of characters, and the first stage in the recognition process usually involves “chopping” the strings into locally separate entities. This process is known as segmentation. Recognition is then performed using these entities. Since segmentation is a hard problem in itself, the assumption of pre-segmented inputs eases the task, so that schemes need to be developed only for isolated input characters.

Offline numeral recognition has many practical applications. Among the most quoted is the reading of postal zip codes in addresses written or typed on envelopes. The benefits of implementing such systems in post offices are enormous. Such systems would make possible the automatic sorting and routing of the millions of mail items that flow through the postal system every day, reducing the human workload and speeding up the whole process. However, despite extensive research, the current techniques do not produce results that meet the desired accuracy. A trusted and reliable numeral recognition system needs to have very high accuracy.

We will now briefly describe the features of Indian scripts. Most Indian scripts are distinguished by the presence of matras (character modifiers) in addition to the main characters, whereas the English script has no matras. Therefore, algorithms developed for English are not directly applicable to Indian scripts. Many OCRs for Indian scripts have been reported [1], [2], [3], [4], [5]. However, none of these has attempted handwritten Hindi text consisting of composite characters that involve both the main characters and matras. In this paper, we present a recognition system specifically addressing handwritten Hindi numerals. However, the proposed recognition scheme is applicable to Hindi words as well after their decomposition into individual components.

We will now briefly review OCRs of Indian languages and English for a bird's eye view of the literature.

Printed Devanagari character recognition is attempted using the Kohonen neural network (KNN) and other types of neural networks [1], [4], [5]. These results are extended to Bangla [5], which also has the header line like Hindi. Structural features like concavities and intersections are used as features. A similar approach is tried for Gujarati in Ref. [3] with limited success. Reasonable results are reported for Gurumukhi script [4]. Preliminary results are also available in the literature on recognition of two popular scripts in south India: Tamil and Kannada [2].

Sinha et al. [6], [7] have reported various aspect ratios (ARs) for Devanagari script recognition. Sethi and Chatterjee [8] have described Devanagari numeral recognition using the structural approach. The primitives used are horizontal and vertical line segments and right and left slants. A decision tree is employed to perform the analysis based on the presence or absence of these primitives and their interconnection. A similar strategy is applied to constrained hand-printed Devanagari characters in Ref. [9]. A neural network approach for isolated characters is also reported in Ref. [10].

A method for recognizing handwritten Hindi numerals based on the structural descriptors of a numeral's shape is given in Ref. [11]. Density, segment and moment features are used as inputs to the classifiers with feed-forward superstructure of the Kohonen modules in Ref. [2]. The density features consist of density of the normalized samples. The moment features consist of moments of orders 2, 3 and 4 of the numeral. The feature vector for the segments consists of: (1) normalized length of the segment; (2) average directivity of the segment; (3) average directivity strength of the segment; (4) average directivity change of the segment. These segments of the numeral are computed based on Ref. [12]. The three individual classifiers are then integrated using meta-pi network. Zernike moments are used as features and neural network as classifier on the hand printed Devanagari characters in Ref. [13]. A contour-following based algorithm for extracting features is presented in Ref. [14]. For classification of the encoded patterns by the nearest neighbor (NN) classifiers, an iterative clustering algorithm is developed to obtain a reduced, but efficient number of prototypes. A new feature extraction technique is presented in Ref. [15] and applied to the printed Hindi numerals. Classification is done using the neural networks.

Handwritten English numeral recognition using fuzzy logic was first attempted by Siy and Chen [16]. Here the handwritten numeral is decomposed into straight lines, portions of a circle and circles. A more flexible scheme is proposed in Ref. [17], where the decomposition is based on the detection of a set of feature points: terminal, intersection and bend points. The perturbations due to writing habits and instruments are taken into account in the recognition of off-line handwritten English numerals in Ref. [18].

Handwritten numeral recognition using self-organizing maps (SOM) and fuzzy rules is presented in Ref. [19]. During the learning phase prototypes are produced, which together with the corresponding variances, are used to determine fuzzy regions and membership functions. In the recognition stage, a fuzzy rule based classifier is employed to classify an input pattern. An unsure pattern is re-entered into the SOM classifier.

A technique in Ref. [20] generates fuzzy rules based on the ID-3 approach by optimizing the defuzzification parameters with the help of a two-layer perceptron. This technique overcomes the difficulties of a conventional syntactic approach to handwritten character recognition, including the problems of choosing a starting or reference point, scaling, and learning by machines.

A new scheme for offline recognition of totally unconstrained handwritten characters is presented in Ref. [21]. It uses a simple multi-layer cluster neural network trained with the back-propagation algorithm. It is also shown there that the use of genetic algorithms avoids the problem of local minima that arises when the multi-layer cluster neural network is trained with the gradient descent technique.

A modified Hough transform method is used in Ref. [22] to extract features, which are fed as the input to a linear classifier (discriminant analysis) and a non-linear classifier (NN). Two neural architectures and two different feature extraction techniques are investigated in Ref. [23], together with a novel technique for character feature extraction. A new technique for local decision combination is presented in Ref. [24]. The technique uses a genetic algorithm to determine the optimal weight vector to balance the local decisions in the combination process. All these methods demonstrate their effectiveness on the CEDAR database.

The benchmarking of state-of-the-art techniques for digit recognition on different databases, viz., CEDAR, MNIST and CENPARMI, is undertaken in Ref. [25]. Eight classifiers, viz., k-NN (k-nearest neighbor), MLP (multi-layer perceptron), RBF (radial basis function), PC (polynomial classifier), two variants of SVC (support vector classifier), LVQ (learning vector quantization) and LQDF (learning quadratic discriminant function), are employed. The 10 features include: 4-orientation and 8-direction chain code features, 4-orientation plus 133D and 8-direction chain code features plus 233D with 11 crossing counts and 22 concavity measurements, 4-orientation and 8-direction gradient features, 4-orientation and 8-direction gradient features from pseudo gray scale images, 4-orientation Kirsch features and 4-orientation PDC features. The 10 features and eight classifiers are combined to give 80 accuracies on the test dataset of each of the three databases. The recognition rates are found to be higher than the best in the literature. We will now discuss the methods that yield the best results for the CEDAR database.

In Ref. [26], two types of 256-dimensional feature vectors are extracted: contour-based gradient distribution (CGD) and directional distance distribution (DDD). Using a non-parametric method, the features are evaluated in terms of their class separation and recognition capabilities. From CGD and DDD, two types of new features are derived: class-common and class-dependent. Using the separating power of the class-common features and the discriminating power of the class-dependent features, modular network architectures are trained to classify the numerals. The latter features are found to give a recognition rate of 98.73%.

In Ref. [27], statistical information is modeled by microstates using HMMs and structural information is modeled by singletons and relationships between macrostates where each macrostate is a collection of individual microstates. The orientations that constitute the statistical information are encoded into discrete codebooks and distributions of locations that constitute the structural information are modeled by the joint Gaussian distribution functions. The recognition process is measured by the degree of matching. The state-duration adapted transition probability gives this degree as state-duration indicates the number of feature segments modeled by the state.

We will now discuss some of our earlier work in character recognition. The back-propagation neural network is used in Ref. [28] for the recognition of handwritten characters. There, feature extraction is done using three different approaches, namely, ring, sector and hybrid. The features consist of normalized vector distances and angles. The hybrid approach, which combines the ring and sector approaches, is found to yield the best results. The same features are adopted in Ref. [29].

We follow the methodology of Ref. [28] for feature extraction in the present work. The recognition approach of Ref. [29], which is based solely on the membership function, is extended here to include entropy. In addition, the membership functions are modified by way of structural parameters, which are necessitated by the large variations in the means and variances of the samples. To avoid this problem, consistent samples were used in Ref. [29].

The proposed recognition approach is briefly described here. The database is divided into training and testing samples. The scanned numeral is partitioned into 24 boxes, from which we extract 24 features. Each feature yields a fuzzy set when it is gathered over all the training samples, so we have 24 fuzzy sets. Each fuzzy set is represented by the exponential membership function, whose parameters are the mean and variance. The means and variances of all training samples form the knowledge base (KB). However, the means and variances vary too much over the sample space. Hence we devise a modified membership function containing structural parameters in addition to the mean and variance. To estimate these parameters, we define an objective function consisting of the entropy constrained by another term, which is the square of the error between the average membership function and unity. When an unknown numeral is presented for recognition, its features are found and then fitted to the membership functions of the known (i.e., reference) numerals using the optimized parameters. The reference numeral that yields the minimum objective function is taken as the identity of the unknown numeral.
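The recognition flow described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' formulation: the Gaussian-like exponential form in `membership`, the fuzzy-entropy expression in `objective`, the variance floor, and all function names are hypothetical, and the two structural parameters and their optimization are omitted because their exact form is not given in this excerpt.

```python
import math

def build_knowledge_base(samples_by_class):
    """Map each class label to a list of per-feature (mean, variance) pairs."""
    kb = {}
    for label, samples in samples_by_class.items():
        n = len(samples)
        stats = []
        for values in zip(*samples):  # iterate feature by feature
            mean = sum(values) / n
            var = sum((v - mean) ** 2 for v in values) / n
            stats.append((mean, max(var, 1e-6)))  # floor is an assumption, avoids division by zero
        kb[label] = stats
    return kb

def membership(x, mean, var):
    """Assumed exponential membership; equals 1 when x matches the mean."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var))

def objective(features, stats):
    """Entropy term plus squared error between average membership and unity."""
    eps = 1e-12  # guards log(0)
    mus = [membership(x, m, v) for x, (m, v) in zip(features, stats)]
    # Assumed fuzzy-entropy form, averaged over the features.
    entropy = -sum(
        mu * math.log(mu + eps) + (1.0 - mu) * math.log(1.0 - mu + eps)
        for mu in mus
    ) / len(mus)
    average = sum(mus) / len(mus)
    return entropy + (1.0 - average) ** 2

def classify(features, kb):
    """Return the label whose reference model gives the minimum objective."""
    return min(kb, key=lambda label: objective(features, kb[label]))

# Toy usage with two features per sample instead of 24.
training = {
    "0": [[0.1, 0.2], [0.2, 0.3]],
    "1": [[0.8, 0.9], [0.9, 0.8]],
}
kb = build_knowledge_base(training)
predicted = classify([0.84, 0.86], kb)
```

In this sketch the knowledge base stores only the per-feature mean and variance; a fuller version would estimate the structural parameters for each reference numeral before evaluating the objective.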

Our approach differs from the state-of-the-art techniques in that we fix the standard size of numerals based on the AR, extract features from the cells of the scanned numeral, associate a distribution with each feature by gathering it over several samples, and then modify this distribution in the context of the unknown numeral's features. In the literature, features are extracted from the thinned numeral, whereas we grid the thinned numeral and then extract features from its cells, thus considering the variability within each cell. Only one type of feature, i.e., normalized distance, is used in our approach, unlike the large number of types in the literature [25]. Our classifier is based on an objective function consisting of the entropy function constrained by the square of the error between the average membership function and unity. In the literature, standard classifiers such as neural, SVM-based and HMM-based classifiers are employed, in which objective functions are defined implicitly [25].

The paper is organized as follows. Section 2 describes the process overview. Section 3 gives the preprocessing techniques, specifically thinning and smoothing. Section 4 deals with the feature extraction. In Section 5 the recognition strategy is presented. Section 6 briefly reports the results of both Hindi and English numeral recognition, and Section 7 gives the conclusions.

Section snippets

Process overview

Character recognition systems are usually validated by running them on independent test sets, on which the systems have not been trained. For these tests to be conclusive, the validation sets should include a fairly large number of samples to reflect the variety of writing styles that are found in real-life applications. In this work we have followed this basic theme.

Preprocessing

The steps of preprocessing are briefly discussed in the following:

Feature extraction

Feature extraction is the crucial phase in numeral identification as each numeral is unique in its own way, thus distinguishing itself from other numerals. Hence, it is very important to extract features in such a way that the recognition of different numerals becomes easier on the basis of the individual features of each numeral.
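As a rough illustration of box-based feature extraction of the kind mentioned in the introduction (24 boxes, one normalized distance per box), the following Python sketch may help. The details are assumptions rather than the paper's specification: the 6 x 4 grid layout, the bottom-left pixel of each box as the reference point, the box diagonal as the normalizer, the zero value for empty boxes, and the name `box_features` are all hypothetical.

```python
import math

def box_features(image, rows=6, cols=4):
    """Return rows*cols normalized-distance features for a binary image.

    image: list of equal-length rows holding 0/1 values (1 = ink pixel).
    """
    height, width = len(image), len(image[0])
    # Box boundaries; integer division spreads any remainder across boxes.
    row_edges = [r * height // rows for r in range(rows + 1)]
    col_edges = [c * width // cols for c in range(cols + 1)]
    features = []
    for r in range(rows):
        for c in range(cols):
            top, bottom = row_edges[r], row_edges[r + 1]
            left, right = col_edges[c], col_edges[c + 1]
            # Distances of ink pixels from the box's bottom-left pixel
            # (the choice of reference point is an assumption).
            dists = [
                math.hypot((bottom - 1) - y, x - left)
                for y in range(top, bottom)
                for x in range(left, right)
                if image[y][x]
            ]
            if not dists:
                features.append(0.0)  # empty box: zero feature (assumption)
            else:
                diagonal = math.hypot(bottom - top, right - left)
                features.append(sum(dists) / len(dists) / diagonal)
    return features

# Toy usage: a 12 x 8 image with a single ink pixel in the first box.
toy = [[0] * 8 for _ in range(12)]
toy[0][1] = 1
toy_features = box_features(toy)
```

Each feature lies between 0 and 1 in this sketch, so features from boxes of slightly different sizes remain comparable.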

Recognition

In order to recognize the unknown numeral set using fuzzy logic, an exponential membership function is selected as it is found to be more suitable for recognition. The fuzzy membership function is constructed using the normalized vector distance as the feature. In this context, we will explain the formation of a fuzzy set from a feature.

The concept of a fuzzy set arising from a set of features is now explained. If there are ‘n’ possible features for each numeral and if there are ‘m’ such

Results

The choice of the number of boxes originally found by experimentation in Ref. [29] is now verified by plotting the entropy of the average membership function of all the features of boxes as a function of the number of boxes for each numeral. As an example, we depict the plot of Entropy vs. the number of boxes for the numeral 3 in Fig. 8 from which we can observe that 24 boxes is the right choice. Next, we have gone for the choice of suitable membership function. Specifically, we have compared

Conclusions

An improved preprocessing step is devised to eliminate barbs in the thinned numerals. The normalized distance features fuzzified by the exponential membership function are found to be effective for the selected box size and number of boxes. The structural parameters used in this membership function are able to capture the variation in the writing styles of numerals. The AR of the sample image is considered in deciding between the two window sizes for the two categories of Hindi numerals. Because of

Acknowledgment

The authors gratefully acknowledge the support of the Department of Science & Technology, Government of India, for this work.


References (32)

  • B.B. Chaudhuri, U. Pal, An OCR system to read two Indian language scripts: Bangla and Devanagari (Hindi), in:...
  • S.S. Marwah, S.K. Mullick, R.M.K. Sinha, Character recognition of devanagari characters using a hierarchial binary...
  • R.M.K. Sinha et al., Machine recognition of Devanagari script, IEEE Trans. Syst. Man Cybern. (1979)
  • I.K. Sethi et al., Machine recognition of handprinted Devanagari numerals, J. Inst. Electron. Telecommun. Eng. (1976)
  • J.C. Sant, S.K. Mullick, Handwritten Devanagari script recognition using CTNNSE algorithm, International Conference on...
  • A. Elnagar et al., Recognition of handwritten Hindu numerals using structural descriptors, J. Exp. Theor. Artif. Intell. (2003)

    About the Author—MADASU HANMANDLU received the B.E. degree in Electrical Engineering from Osmania University, Hyderabad, India, in 1973, the M.Tech degree in power systems from R.E.C. Warangal, Jawaharlal Nehru Technological University, India, in 1976, and the Ph.D. degree in control systems from Indian Institute of Technology, Delhi, India, in 1981. From 1979 to 1981, he was a senior scientific officer in Applied Systems Research program of the Department of Electrical Engineering, IIT Delhi. He joined the EE department as a lecturer in 1981 and became a Professor in 1997. He was with Machine Vision Group, City University, London, in 1988, and Robotics Research Group, Oxford University, in 1993, as part of the Indo-UK research collaboration. He was a Visiting Professor with the Faculty of Engineering, Multimedia University, Malaysia from March 2001 to March 2003. He worked in the areas of Power Systems, Control, Robotics and Computer Vision, before shifting to fuzzy theory. His current research interests mainly include fuzzy modeling of dynamic systems and applications of fuzzy logic to image processing, document processing, bio-medical imaging and intelligent control. He has authored a book on Computer Graphics and also has over 160 publications to his credit. He is an associate editor of Pattern Recognition Journal and also reviews for several other journals such as IEEE Transactions on Fuzzy Systems, Image Processing and SMC. He is a member of IEEE and is listed in Reference Asia; Asia's who's who of Men and Women of achievement; 5000 Personalities of the World (1998), American Biographical Institute.

    About the Author—O.V. RAMANA MURTHY did his BE in Electrical and Electronics Engineering from Andhra University, Visakhapatnam in the year 1999 and M.S. (Research) in Control Systems from IIT Delhi, New Delhi in the year 2001. He has been working as Project scientist in IIT Delhi since then. He is also currently pursuing his PhD on part-time basis in IIT Delhi. He has two Journal papers and 7 IEEE/International conference papers to his credit. His areas of interest include: Applications of soft computing techniques to Document processing and Image processing.
