Abstract
This paper describes an intelligent forms processing system (IFPS) that provides capabilities for automatically indexing form documents for storage in and retrieval from a document library, and for capturing information from scanned form images using intelligent character recognition (ICR). The system also provides capabilities for efficiently storing form images. IFPS consists of five major processing components: (1) An interactive document analysis stage that analyzes a blank form in order to define a model of each type of form to be accepted by the system; the parameters of each model are stored in a form library. (2) A form recognition module that collects features of an input form in order to match it against a model represented in the form library; the primary feature used in this step is the pattern of lines defining data areas on the form. (3) A data extraction component that registers the selected model to the input form, locates data added to the form in fields of interest, and moves the data image to a separate image area. A simple mask defining the center of the data region suffices to initiate the extraction process; search routines are invoked to track data that extends beyond the masks. Other special processing is called on to detect lines that intersect the data image and to delete the lines with minimum distortion to the rest of the image. (4) An ICR unit that converts the extracted image data to symbol code for input to database or other conventional processing systems. Three types of ICR logic have been implemented in order to accommodate monospace typing, proportionally spaced machine text, and handprinted alphanumerics. (5) A forms dropout module that removes the fixed part of a form and retains only the filled-in data for storage. The stored data can later be combined with the fixed form to reconstruct the original form.
This approach provides for extremely efficient storage of form images, thus making possible the storage of a very large number of forms in the system. IFPS is implemented as part of a larger image management system called the Image and Records Management (IRM) system. It is being applied to forms data management in several state government applications.
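The forms dropout idea in component (5) can be illustrated with a minimal sketch. It is not the IFPS implementation: it assumes binary images stored as nested lists (1 = black pixel), a filled form that is already perfectly registered to the blank form, and no scanner noise. The function names `drop_out` and `reconstruct` are hypothetical and chosen only for illustration.

```python
from typing import List

# Binary image: 1 = black pixel, 0 = white pixel (assumed encoding).
Image = List[List[int]]


def drop_out(filled: Image, blank: Image) -> Image:
    """Keep only pixels that are black in the filled form but white in the blank form."""
    return [[f & (1 - b) for f, b in zip(frow, brow)]
            for frow, brow in zip(filled, blank)]


def reconstruct(data: Image, blank: Image) -> Image:
    """Overlay the stored data pixels on the blank form (pixelwise OR)."""
    return [[d | b for d, b in zip(drow, brow)]
            for drow, brow in zip(data, blank)]


# A tiny "form": a rectangular box drawn with black lines.
blank = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]

# The same form with a few data pixels filled in inside the box.
filled = [
    [1, 1, 1, 1, 1],
    [1, 0, 1, 0, 1],
    [1, 1, 1, 0, 1],
    [1, 1, 1, 1, 1],
]

data = drop_out(filled, blank)        # only the filled-in pixels remain
restored = reconstruct(data, blank)   # blank form plus stored data
```

In this toy case only 3 of the 20 pixels need to be stored per filled form, and the blank form is stored once. A real system would also have to cope with misregistration and with data strokes that cross form lines, which this sketch ignores.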
Cite this article
Casey, R., Ferguson, D., Mohiuddin, K. et al. Intelligent forms processing system. Machine Vis. Apps. 5, 143–155 (1992). https://doi.org/10.1007/BF02626994
DOI: https://doi.org/10.1007/BF02626994