Improving OCR performance using character degradation models and boosting algorithm

doi:10.1016/S0167-8655(97)00137-2

Pattern Recognition Letters

Volume 18, Issues 11–13, November 1997, Pages 1415-1419

Abstract

We introduce three character degradation models into a boosting algorithm for training an ensemble of character classifiers. We also compare the boosting ensemble with a standard ensemble of networks trained independently with the same character degradation models. An interesting finding of this comparison is that, although the boosting ensemble is slightly more accurate than the standard ensemble at zero reject rate, the advantage of boosting over independent training quickly disappears as more patterns are rejected. Eventually, the standard ensemble outperforms the boosting ensemble at high reject rates. An explanation of this phenomenon is provided in the paper.

Introduction

In this paper, we study the effectiveness of a boosting algorithm (Drucker et al., 1993) in improving the performance of OCR. The original theoretical work on the boosting algorithm was done by Schapire (1990). He showed that it is in principle possible for a combination of weak classifiers (classifiers that perform only slightly better than random guessing) to achieve an arbitrarily low error on the training data set. Drucker et al. (1993) applied the boosting algorithm to character recognition. They produced a large number of training patterns by deforming the original character images to various degrees. The boosting ensemble was shown to improve recognition performance dramatically over that of the single network used as the first network in the boosting hierarchy. However, it remains to be answered whether the boosting ensemble outperforms a standard ensemble of independently trained networks. In this paper, we provide a comparative study of the boosting ensemble and the standard ensemble. We also introduce three character degradation models into the boosting algorithm.

Section snippets

Boosting algorithm

In the boosting algorithm, the weak classifiers are trained hierarchically to learn harder and harder parts of a classification problem. The algorithm requires an oracle to produce a large number of independent training patterns. The basic boosting algorithm works as follows.

1. Generate a set of training data and train the first classifier.
2. Generate a set of training data for training the second classifier in the following manner: Flip a coin. If it comes up heads, the oracle generates a pattern and
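The data-filtering idea behind these steps can be sketched in Python. The description above is cut off, so the sketch relies on the boosting-by-filtering procedure as it is commonly described for Drucker et al. (1993): the second classifier is trained on patterns of which roughly half are misclassified by the first classifier, and the third classifier is trained on patterns on which the first two disagree. That continuation is an assumption with respect to the text shown here, and all names (`oracle`, `build_second_training_set`, `build_third_training_set`) are hypothetical.

```python
import random


def build_second_training_set(oracle, first_classifier, size, rng=None):
    """Collect patterns so that the first classifier errs on about half of them.

    oracle: hypothetical callable returning a (pattern, label) pair.
    first_classifier: hypothetical callable mapping a pattern to a label.
    Note: this loops forever if the first classifier never errs (or always errs).
    """
    rng = rng or random.Random()
    samples = []
    while len(samples) < size:
        # Coin flip: heads -> keep drawing until a misclassified pattern appears,
        # tails -> keep drawing until a correctly classified pattern appears.
        want_error = rng.random() < 0.5
        while True:
            pattern, label = oracle()
            misclassified = first_classifier(pattern) != label
            if misclassified == want_error:
                samples.append((pattern, label))
                break
    return samples


def build_third_training_set(oracle, first_classifier, second_classifier, size):
    """Collect patterns on which the first two classifiers disagree (assumed step)."""
    samples = []
    while len(samples) < size:
        pattern, label = oracle()
        if first_classifier(pattern) != second_classifier(pattern):
            samples.append((pattern, label))
    return samples
```

In an OCR setting the oracle would be the degradation models applied to original character images, which is why the algorithm needs a cheap source of many fresh patterns: most generated patterns are discarded by the filter once the first classifier is reasonably accurate.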

Character degradation models

We introduce three document degradation models into the boosting algorithm: (i) affine transformation, (ii) the image deformation model used by Jain et al. (1996), and (iii) a probabilistic model for document degradation (Kanungo, 1996).

The affine model is a linear transformation of the coordinate system that takes the following operations into account: (i) translation, (ii) scaling, (iii) rotation, and (iv) shearing. In our OCR system, character features are invariant to translation and
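As an illustration of the four operations named in the affine model, the following Python sketch builds a single 2x3 affine matrix and applies it to a point. The composition order (scale, then shear, then rotate, then translate), the single horizontal shear parameter, and the function names are assumptions made for this example; the text above does not specify them.

```python
import math


def affine_matrix(tx=0.0, ty=0.0, sx=1.0, sy=1.0, theta=0.0, shear=0.0):
    """Return a 2x3 matrix [A | t] with A = R(theta) * Sh(shear) * S(sx, sy).

    The order of composition is an illustrative choice, not taken from the paper.
    """
    c, s = math.cos(theta), math.sin(theta)
    # Shear times scale: [[1, shear], [0, 1]] * [[sx, 0], [0, sy]]
    a, b = sx, shear * sy
    d, e = 0.0, sy
    # Rotation times (shear * scale), with the translation in the last column
    return [
        [c * a - s * d, c * b - s * e, tx],
        [s * a + c * d, s * b + c * e, ty],
    ]


def apply_affine(m, point):
    """Map an (x, y) point through the 2x3 affine matrix m."""
    x, y = point
    return (
        m[0][0] * x + m[0][1] * y + m[0][2],
        m[1][0] * x + m[1][1] * y + m[1][2],
    )
```

Degraded training characters could then be produced by drawing the parameters at random from small ranges and resampling each character image through the resulting map; the ranges themselves would have to be tuned and are not given here.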

Experiments and discussions

We used the lowercase letters of the NIST (National Institute of Standards and Technology) Special Database 3 (SD3: 39,636 segmented characters) and Test Data 1 (TD1: 12,000 segmented characters); these databases consist of pre-segmented characters used in the 1992 comparative study (Wilkinson et al., 1992). In our experiments, the SD3 data set was further partitioned into a training data set (SD3-train) with 27,636 characters and a validation data set (SD3-valid) with 12,000 characters.

The

Conclusions

We have introduced three character degradation models into boosting training and compared the boosting ensemble with a standard ensemble of networks trained independently with the same character degradation models. Both ensembles outperform the single network trained using the character degradation models. An interesting finding of our comparison is that, although the boosting ensemble has a slightly higher accuracy than the standard ensemble at zero reject rate, the advantage of the boosting
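To make the notion of accuracy at a given reject rate concrete, the following Python sketch computes the error rate on the patterns that remain after the least confident fraction has been rejected. The confidence measure (for example, the highest averaged ensemble output), the definition of error as errors divided by accepted patterns, and the function name are assumptions for illustration; the paper's exact definitions are not shown in this excerpt.

```python
def error_at_reject_rate(confidences, correct, reject_rate):
    """Error rate on accepted patterns after rejecting the least confident ones.

    confidences: one confidence score per pattern (higher means more certain).
    correct: one boolean per pattern, True if the ensemble decision was right.
    reject_rate: fraction of patterns to reject, between 0 and 1.
    """
    # Sort pattern indices from least to most confident.
    order = sorted(range(len(confidences)), key=lambda i: confidences[i])
    n_reject = int(reject_rate * len(order))
    accepted = order[n_reject:]
    if not accepted:
        return 0.0
    errors = sum(1 for i in accepted if not correct[i])
    return errors / len(accepted)
```

Evaluating this function over a range of reject rates for each ensemble gives the error-reject curves on which such a comparison rests: an ensemble whose errors are concentrated among its low-confidence decisions improves faster as the reject rate increases.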

References (6)

Drucker, H., Schapire, R., Simard, P., 1993. Improving performance in neural networks using boosting algorithm. In:...
Jain, A., Zhong, Y., Lakshmanan, S., 1996. Object matching using deformable templates. IEEE Trans. Pattern Anal....
Kanungo, T., 1996. Document degradations models and a methodology for degradation model validation. Ph.D. Thesis....
