Image-based form document retrieval
Introduction
Form processing is an essential operation in business and government organizations. Although the percentage of electronic filings is increasing, paper-based forms will continue to be widely used in the foreseeable future. Because of the many advantages offered by on-line processing, much effort has been devoted to image-based form analysis and recognition. This includes practical systems for machine reading of form data from scanned images [1], [2], [3], form dropout [4], [5], [6], [7] and form layout description [8]. Other investigations on various aspects of form analysis and recognition can be found in Refs. [9], [10], [11], [12], [13], [14], [15].
In this paper we consider the following problem: given a form image database and a query image, how do we retrieve the form images in the database that have the same or a similar layout structure as the query? This problem is similar to the form recognition work investigated in Ref. [1]. The difference is that our query images are not prepared for data entry; they only need to provide sufficient clues for similarity-based retrieval. The query images can therefore be at a different scale (resolution) or of relatively poor quality compared to the database images. The challenges in form image analysis arise from the large variety of form formats (layouts). Although form formats can be very flexible, we assume that the forms have the following layout characteristics: (i) a form has two horizontal frame lines serving as top and bottom boundaries, and preferably also two vertical frame lines serving as left and right boundaries; and (ii) most of the useful data items in a form are enclosed by rectangles formed by frame lines.
We address the problem of image-based retrieval for forms that have the above characteristics. Our goal is to propose a similarity measure for forms that is insensitive to translation, scaling, moderate skew (<5°), and image quality fluctuations. Sometimes, forms of the same type that come from different printing sources differ slightly in their final layout, so we do not assume the invariance of the geometrical proportion of the frame structure. Although we only address the retrieval problem in this paper, it is obvious that this similarity measure can also be used to automatically determine form type for form data entry systems and automatically organize images in constructing form image databases.
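To make the translation and scaling requirement concrete, the following minimal Python sketch expresses horizontal frame line positions relative to the top and bottom boundary lines assumed in characteristic (i). It is an illustration only and not the similarity measure proposed in this paper; the function name `normalize_positions` and the representation of a form as a list of line positions in pixels are assumptions made for the example.

```python
from typing import List


def normalize_positions(positions: List[float]) -> List[float]:
    """Map frame line positions to [0, 1] relative to the outermost lines.

    Hypothetical helper: `positions` holds the y-coordinates (in pixels)
    of the horizontal frame lines detected in one form image.
    """
    lo, hi = min(positions), max(positions)
    if hi == lo:
        raise ValueError("need at least two distinct frame lines")
    # Subtracting `lo` removes translation; dividing by the span removes scale.
    return [(p - lo) / (hi - lo) for p in sorted(positions)]


# The same layout scanned at a different resolution and offset
# yields the same normalized positions.
original = normalize_positions([100, 300, 500])
rescaled = normalize_positions([250, 750, 1250])
print(original, rescaled)
```

Because forms of the same type from different printing sources may differ slightly in proportion, normalized positions of this kind would only agree approximately, so any comparison built on them would need a tolerance.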
Form signature
Form signature is the feature used to evaluate the similarity between forms in our system. Among the many components contained in a form, such as lines, pre-printed text fields, filled-in text fields, check boxes and circles, form identification number, logos, etc., frame lines reflect the essential layout structure of the form and are sufficient to uniquely identify the form type in most cases. Further, frame lines can be detected more reliably than other components. So, we define form
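As a rough illustration of why frame lines are comparatively easy to detect, the sketch below marks the rows of a binary image that contain a long unbroken run of black pixels. It is a simplified stand-in and not the line detector used in this work; the names `longest_run` and `horizontal_line_rows`, the 0/1 list-of-lists image representation, and the `min_fraction` threshold are all assumptions.

```python
from typing import List


def longest_run(row: List[int]) -> int:
    """Length of the longest run of consecutive black (1) pixels in a row."""
    best = current = 0
    for pixel in row:
        current = current + 1 if pixel else 0
        best = max(best, current)
    return best


def horizontal_line_rows(image: List[List[int]],
                         min_fraction: float = 0.8) -> List[int]:
    """Indices of rows whose longest black run spans most of the width.

    `min_fraction` is a hypothetical threshold, not a value from the paper.
    """
    if not image:
        return []
    width = len(image[0])
    return [y for y, row in enumerate(image)
            if longest_run(row) >= min_fraction * width]


image = [
    [1] * 10,                        # solid frame line
    [0] * 10,                        # blank row
    [1, 1, 0, 1, 1, 0, 1, 1, 0, 1],  # text-like row with short runs
    [1] * 9 + [0],                   # frame line with a small gap at the end
]
print(horizontal_line_rows(image))
```

A practical detector would also merge adjacent rows belonging to one thick line and tolerate the moderate skew mentioned in the introduction, which this sketch does not attempt.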
Similarity measure for forms
In order to define the similarity measure, we first introduce the concept of a grid on the form image plane.
Form retrieval experiments
In designing our system, we assume that forms with the same layout structure in the database have already been organized into the same group. So, before the system accepts a query, the signature of every type of form in the database can be obtained and be used to represent the relevant group of form images. The retrieval procedure only needs to compute the signature of the query image and then compare it with all the form signatures in the database. The block diagram of the retrieval system is
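The retrieval procedure described here (one stored signature per form type, compared against the query signature) can be sketched as a simple ranking loop. The code below is a minimal illustration under stated assumptions: a signature is represented as a list of normalized line positions, and `toy_similarity` is a placeholder, not the similarity measure defined in this paper. The names `retrieve`, `toy_similarity`, and `top_k` are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Signature = Sequence[float]


def toy_similarity(a: Signature, b: Signature) -> float:
    """Placeholder similarity in [0, 1]; NOT the measure proposed in the paper."""
    if len(a) != len(b):
        return 0.0
    if not a:
        return 1.0
    mean_diff = sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    return max(0.0, 1.0 - mean_diff)


def retrieve(query: Signature,
             database: Dict[str, Signature],
             similarity: Callable[[Signature, Signature], float] = toy_similarity,
             top_k: int = 3) -> List[Tuple[str, float]]:
    """Score the query against every stored form type and return the best matches."""
    scored = [(name, similarity(query, sig)) for name, sig in database.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# One precomputed signature per form group (made-up example data).
database = {
    "tax": [0.0, 0.5, 1.0],
    "invoice": [0.0, 0.2, 1.0],
    "claim": [0.0, 0.25, 0.5, 1.0],
}
for name, score in retrieve([0.0, 0.48, 1.0], database):
    print(name, round(score, 3))
```

Since signatures are computed once per form group before any query arrives, the cost of a query in this sketch grows with the number of form types rather than the number of stored images.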
Conclusions and future work
We have addressed the problem of image-based form document retrieval. The central issue of this problem is the definition of a similarity measure between a pair of forms. Based on the definition of form signature, we have proposed a similarity measure that is insensitive to translation, scaling, moderate skew (<5°) and variations of the geometrical proportion of the form layout. A larger amount of skew can be handled by first applying a deskewing procedure [16]. This similarity measure also has
Acknowledgements
We would like to thank Dr. Jianchang Mao and Dr. Moidin Mohiuddin of IBM Almaden Research Center for their support of this work.
References (16)
- et al., Extraction of characters from form documents by feature point clustering, Pattern Recognition Lett. (1995)
- et al., An efficient algorithm for form structure extraction using strip projection, Pattern Recognition (1998)
- et al., Recognition and data extraction of form documents based on three types of line segments, Pattern Recognition (1998)
- et al., A robust and fast skew detection algorithm for generic document, Pattern Recognition (1996)
- et al., Automated forms-processing software and services, IBM J. Res. Develop. (1996)
- J. Mao, M. Abayan, K. Mohiuddin, A model-based form processing sub-system, Proceedings of ICPR'96, 1996, pp....
- et al., Intelligent forms processing system, Mach. Vision Appl. (1992)
- et al., A generic system for form dropout, IEEE Trans. Pattern Anal. Mach. Intell. (1996)
About the Author—JINHUI LIU received the B.S., M.S. and Ph.D. degrees in electronic engineering from Tsinghua University, Beijing, China, in 1990, 1992 and 1997, respectively. He was a postdoctoral researcher in the Department of Computer Science and Engineering and Center for Microbial Ecology at Michigan State University from 1997 to 1999 and was a visiting scientist at the IBM Almaden Research Center from July to October 1998. He is now a postdoctoral researcher in the Department of Computer Engineering and Computer Science at the University of Missouri–Columbia. His research interests include document image understanding, pattern recognition, and machine intelligence. He has performed research on form processing, handprinted Chinese character recognition, handwritten numeral recognition, postal address recognition and bacterial image analysis.
About the Author—ANIL JAIN is a University Distinguished Professor and Chair of the Department of Computer Science and Engineering at Michigan State University. His research interests include statistical pattern recognition, Markov random fields, texture analysis, neural networks, document image analysis, fingerprint matching and 3D object recognition. He received the best paper awards in 1987 and 1991 and certificates for outstanding contributions in 1976, 1979, 1992, and 1997 from the Pattern Recognition Society. He also received the 1996 IEEE Trans. Neural Networks Outstanding Paper Award. He was the Editor-in-Chief of the IEEE Trans. on Pattern Analysis and Machine Intelligence (1990–1994). He is the co-author of Algorithms for Clustering Data, Prentice-Hall, 1988, has edited the book Real-time Object Measurement and Classification, Springer-Verlag, 1988, and co-edited the books Analysis and Interpretation of Range Images, Springer-Verlag, 1989, Markov Random Fields, Academic Press, 1992, Artificial Neural Networks and Pattern Recognition, Elsevier, 1993, 3D Object Recognition, Elsevier, 1993, and BIOMETRICS: Personal Identification in Networked Society, Kluwer, 1998. He is a Fellow of the IEEE and IAPR. He received a Fulbright research award in 1998.