Data extraction from form images

  • Conference paper
Database and Expert Systems Applications (DEXA 1995)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 978)

Abstract

In this paper, we describe a system that extracts textual information from images of structured documents. In particular, the model and algorithms we describe are used to process forms in which the information fields cannot be located by their position on the page alone, but can be identified after locating the corresponding instruction fields. The proposed model is based on attributed relational graphs; form registration and the location of information fields are performed by algorithms based on the hypothesize-and-verify paradigm. Instruction fields are located holistically, using connectionist models.
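The hypothesize-and-verify idea over an attributed relational graph can be illustrated with a minimal sketch. It is not the authors' algorithm: the `Node` class, the function names `register` and `locate_info_field`, the use of displacement vectors as edge attributes, and the pixel tolerance are all assumptions made for illustration. It also assumes that instruction fields have already been detected and labelled, a step the paper assigns to connectionist models.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Node:
    """A field with a label and a position (hypothetical attributes)."""
    label: str
    x: float
    y: float


def relation(a, b):
    """Edge attribute assumed here: displacement vector from a to b."""
    return (b.x - a.x, b.y - a.y)


def verify(model, assignment, tol):
    """Accept a hypothesis only if every pairwise displacement between
    matched fields agrees with the model within tol."""
    names = list(assignment)
    for i, p in enumerate(names):
        for q in names[i + 1:]:
            mdx, mdy = relation(model[p], model[q])
            idx, idy = relation(assignment[p], assignment[q])
            if hypot(mdx - idx, mdy - idy) > tol:
                return False
    return True


def register(model, detections, tol=5.0):
    """Hypothesize a match for one model node at a time, verify it
    against the matches made so far, and backtrack on failure."""
    order = list(model)

    def extend(assignment, used):
        if len(assignment) == len(order):
            return assignment
        name = order[len(assignment)]
        for k, det in enumerate(detections):
            if k in used or det.label != model[name].label:
                continue
            candidate = {**assignment, name: det}
            if verify(model, candidate, tol):
                result = extend(candidate, used | {k})
                if result is not None:
                    return result
        return None

    return extend({}, frozenset())


def locate_info_field(model_instr, model_info, matched_instr):
    """Predict an information field position from its matched
    instruction field and the offset stored in the model."""
    dx, dy = relation(model_instr, model_info)
    return (matched_instr.x + dx, matched_instr.y + dy)


# Model of a form with two instruction fields.
model = {"name": Node("Name", 10, 10), "date": Node("Date", 10, 50)}
# Detections in a scanned image: shifted by (3, -2), plus one distractor.
detections = [Node("Name", 200, 300), Node("Date", 13, 48), Node("Name", 13, 8)]

match = register(model, detections)
name_value_at = locate_info_field(model["name"], Node("name_value", 60, 10), match["name"])
```

In this example the first hypothesis for the "name" field is the distractor. It is rejected once the "date" field fails verification against it, and the search backtracks to the correct detection.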

Editor information

Norman Revell, A Min Tjoa

Copyright information

© 1995 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cesarini, F., Gori, M., Marinai, S., Soda, G. (1995). Data extraction from form images. In: Revell, N., Tjoa, A.M. (eds) Database and Expert Systems Applications. DEXA 1995. Lecture Notes in Computer Science, vol 978. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0049141

  • DOI: https://doi.org/10.1007/BFb0049141

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-60303-0

  • Online ISBN: 978-3-540-44790-0

  • eBook Packages: Springer Book Archive
