Elsevier

Pattern Recognition

Volume 33, Issue 3, March 2000, Pages 503-513
Image-based form document retrieval

https://doi.org/10.1016/S0031-3203(99)00066-7

Abstract

We address the problem of image-based form document retrieval. The essential element of this problem is the definition of a similarity measure that is applicable in real situations, where query images are allowed to differ from the database images. Based on the definition of form signature, we have proposed a similarity measure that is insensitive to translation, scaling, moderate skew (<5°) and variations in the geometrical proportion of the form layout. This similarity measure also has a good tolerance to line detection errors. We have developed a prototype form retrieval system based on the proposed similarity measure. Preliminary experimental results on a database containing 100 different kinds of forms are encouraging.

Introduction

Form processing is an essential operation in business and government organizations. Although the percentage of electronic filings is increasing, paper-based forms will continue to be widely used in the foreseeable future. Because of the many advantages offered by on-line processing, much effort has been devoted to image-based form analysis and recognition. This includes practical systems for machine reading of form data from scanned images [1], [2], [3], form dropout [4], [5], [6], [7] and form layout description [8]. Other investigations on various aspects of form analysis and recognition can be found in Refs. [9], [10], [11], [12], [13], [14], [15].

In this paper we consider the following problem: Given a form image database and a query image, how do we retrieve form images in the database with the same or similar layout structure as the query? This problem is similar to the form recognition work investigated in Ref. [1]. The difference is that our query images are not prepared for data entry, but for providing sufficient clues for similarity-based retrieval. The query images can therefore be at a different scale (resolution) or have relatively poor quality compared to database images. The challenges in form image analysis arise from the large variety of form formats (layouts). Although form formats can be very flexible, we assume that the forms have the following layout characteristics: (i) a form has two horizontal frame lines serving as top and bottom boundaries, and preferably also two vertical frame lines serving as left and right boundaries; and (ii) most of the useful data items in a form are enclosed by rectangles formed by frame lines.

We address the problem of image-based retrieval for forms that have the above characteristics. Our goal is to propose a similarity measure for forms that is insensitive to translation, scaling, moderate skew (<5°), and image quality fluctuations. Forms of the same type that come from different printing sources sometimes differ slightly in their final layout, so we do not assume the invariance of the geometrical proportion of the frame structure. Although we only address the retrieval problem in this paper, the same similarity measure can also be used to automatically determine form type for form data entry systems and to automatically organize images when constructing form image databases.

Section snippets

Form signature

Form signature is the feature used to evaluate the similarity between forms in our system. Among the many components contained in a form, such as lines, pre-printed text fields, filled-in text fields, check boxes and circles, form identification number, logos, etc., frame lines reflect the essential layout structure of the form and are sufficient to uniquely identify the form type in most cases. Further, frame lines can be detected more reliably than other components. So, we define form

Similarity measure for forms

In order to define the similarity measure, we first introduce the concept of a grid on the form image plane.

Form retrieval experiments

In designing our system, we assume that forms with the same layout structure in the database have already been organized into the same group. So, before the system accepts a query, the signature of every type of form in the database can be obtained and be used to represent the relevant group of form images. The retrieval procedure only needs to compute the signature of the query image and then compare it with all the form signatures in the database. The block diagram of the retrieval system is
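The retrieval procedure described above (one stored signature per form type, a signature computed for the query, and a comparison against every stored signature) can be sketched as follows. This is only an illustrative sketch under stated assumptions: the `Signature` layout, `make_signature`, `axis_similarity`, the tolerance `tol` and `retrieve` are hypothetical names, and the placeholder similarity is a simple line-matching score, not the grid-based measure proposed in the paper. It normalizes line positions so that translation and uniform scaling cancel out, but it does not model skew, line detection errors or variations in layout proportion.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class Signature:
    """Hypothetical signature: normalized frame-line positions per axis."""
    horizontal: Tuple[float, ...]  # y-positions of horizontal frame lines
    vertical: Tuple[float, ...]    # x-positions of vertical frame lines


def normalize(positions: Sequence[float]) -> Tuple[float, ...]:
    """Map positions to [0, 1] so translation and uniform scaling cancel."""
    ordered = sorted(positions)
    if len(ordered) < 2 or ordered[-1] == ordered[0]:
        return tuple(0.0 for _ in ordered)
    lo, span = ordered[0], ordered[-1] - ordered[0]
    return tuple((p - lo) / span for p in ordered)


def make_signature(h_lines: Sequence[float], v_lines: Sequence[float]) -> Signature:
    """Build a signature from raw line positions (in pixels, any scale)."""
    return Signature(normalize(h_lines), normalize(v_lines))


def axis_similarity(a: Sequence[float], b: Sequence[float], tol: float = 0.02) -> float:
    """Placeholder score: fraction of lines on either side that have a
    counterpart within `tol` on the other side (assumed, not from the paper)."""
    if not a and not b:
        return 1.0
    if not a or not b:
        return 0.0
    hits_a = sum(1 for p in a if any(abs(p - q) <= tol for q in b))
    hits_b = sum(1 for q in b if any(abs(p - q) <= tol for p in a))
    return (hits_a + hits_b) / (len(a) + len(b))


def similarity(s1: Signature, s2: Signature) -> float:
    """Average the per-axis scores into one value in [0, 1]."""
    return 0.5 * (axis_similarity(s1.horizontal, s2.horizontal)
                  + axis_similarity(s1.vertical, s2.vertical))


def retrieve(query: Signature, database: Dict[str, Signature],
             top_k: int = 5) -> List[Tuple[str, float]]:
    """Compare the query with every stored signature; return the best matches."""
    scored = [(name, similarity(query, sig)) for name, sig in database.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# One signature per form type, computed before any query is accepted.
database = {
    "form_A": make_signature([0, 50, 100], [0, 100]),
    "form_B": make_signature([0, 20, 40, 100], [0, 30, 100]),
}
# A query that is a shifted, 2x-scaled copy of form_A.
query = make_signature([10, 110, 210], [5, 205])
ranked = retrieve(query, database)
```

Because each form type is represented by a single signature, the cost of a query grows with the number of form types rather than with the number of stored images.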

Conclusions and future work

We have addressed the problem of image-based form document retrieval. The central issue of this problem is the definition of a similarity measure between a pair of forms. Based on the definition of form signature, we have proposed a similarity measure that is insensitive to translation, scaling, moderate skew (<5°) and variations of the geometrical proportion of the form layout. A larger amount of skew can be handled by first applying a deskewing procedure [16]. This similarity measure also has

Acknowledgements

We would like to thank Dr. Jianchang Mao and Dr. Moidin Mohiuddin of IBM Almaden Research Center for their support of this work.

About the Author—JINHUI LIU received the B.S., M.S. and Ph.D. degrees in electronic engineering from Tsinghua University, Beijing, China, in 1990, 1992 and 1997, respectively. He was a postdoctoral researcher in the Department of Computer Science and Engineering and Center for Microbial Ecology at Michigan State University from 1997 to 1999 and was a visiting scientist at the IBM Almaden Research Center from July to October 1998. He is now a postdoctoral researcher in the Department of Computer Engineering and Computer Science at the University of Missouri–Columbia. His research interests include document image understanding, pattern recognition, and machine intelligence. He has performed research on form processing, handprinted Chinese character recognition, handwritten numeral recognition, postal address recognition and bacterial image analysis.

About the Author—ANIL JAIN is a University Distinguished Professor and Chair of the Department of Computer Science and Engineering at Michigan State University. His research interests include statistical pattern recognition, Markov random fields, texture analysis, neural networks, document image analysis, fingerprint matching and 3D object recognition. He received the best paper awards in 1987 and 1991 and certificates for outstanding contributions in 1976, 1979, 1992, and 1997 from the Pattern Recognition Society. He also received the 1996 IEEE Trans. Neural Networks Outstanding Paper Award. He was the Editor-in-Chief of the IEEE Trans. on Pattern Analysis and Machine Intelligence (1990–1994). He is the co-author of Algorithms for Clustering Data, Prentice-Hall, 1988, has edited the book Real-time Object Measurement and Classification, Springer-Verlag, 1988, and co-edited the books Analysis and Interpretation of Range Images, Springer-Verlag, 1989, Markov Random Fields, Academic Press, 1992, Artificial Neural Networks and Pattern Recognition, Elsevier, 1993, 3D Object Recognition, Elsevier, 1993, and BIOMETRICS: Personal Identification in Networked Society, Kluwer, 1998. He is a Fellow of the IEEE and IAPR. He received a Fulbright research award in 1998.