Paper
7 March 1996 Intelligent form removal with character stroke preservation
Author Affiliations +
Proceedings Volume 2660, Document Recognition III; (1996) https://doi.org/10.1117/12.234713
Event: Electronic Imaging: Science and Technology, 1996, San Jose, CA, United States
Abstract
A new technique for intelligent form removal has been developed along with a new method for evaluating its impact on optical character recognition (OCR). All the dominant lines in the image are automatically detected using the Hough line transform and intelligently erased while simultaneously preserving overlapping character strokes by computing line width statistics and keying off of certain visual cues. This new method of form removal operates on loosely defined zones with no image deskewing. Any field in which the writer is provided a horizontal line to enter a response can be processed by this method. Several examples of processed fields are provided, including a comparison of results between the new method and a commercially available forms removal package. Even if this new form removal method did not improve character recognition accuracy, it is still a significant improvement to the technology because the requirement of a priori knowledge of the form's geometric details has been greatly reduced. This relaxes the recognition system's dependence on rigid form design, printing, and reproduction by automatically detecting and removing some of the physical structures (lines) on the form. Using the National Institute of Standards and Technology (NIST) public domain form-based handprint recognition system, the technique was tested on a large number of fields containing randomly ordered handprinted lowercase alphabets, as these letters (especially those with descenders) frequently touch and extend through the line along which they are written. Preserving character strokes improves overall lowercase recognition performance by 3%, which is a net improvement, but a single performance number like this doesn't communicate how the recognition process was really influenced. There is expected to be trade- offs with the introduction of any new technique into a complex recognition system. To understand both the improvements and the trade-offs, a new analysis was designed to compare the statistical distributions of individual confusion pairs between two systems. As OCR technology continues to improve, sophisticated analyses like this are necessary to reduce the errors remaining in complex recognition problems.
© (1996) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Michael D. Garris "Intelligent form removal with character stroke preservation", Proc. SPIE 2660, Document Recognition III, (7 March 1996); https://doi.org/10.1117/12.234713
Lens.org Logo
CITATIONS
Cited by 4 scholarly publications.
Advertisement
Advertisement
RIGHTS & PERMISSIONS
Get copyright permission  Get copyright permission on Copyright Marketplace
KEYWORDS
Optical character recognition

Statistical analysis

Matrices

Error analysis

Hough transforms

Tolerancing

Chlorine

RELATED CONTENT


Back to Top