Multiscale document description using rectangular granulometries

Document Analysis and Recognition

Abstract.

When comparing document images based on visual similarity, it is difficult to determine the correct scale and features for document representation. We report on a new form of multivariate granulometries based on rectangles of varying size and aspect ratio. These rectangular granulometries are used to probe the layout structure of document images, and the rectangular size distributions derived from them are used as descriptors for document images. Feature selection is used to reduce the dimensionality and redundancy of the size distributions while preserving the essence of the visual appearance of a document. Experimental results indicate that rectangular size distributions are an effective way to characterize the visual similarity of document images and that they allow insightful interpretation of classification and retrieval results in the original image space rather than in an abstract feature space.
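To make the idea of a rectangular size distribution concrete, the following is a minimal sketch and not the authors' implementation. It assumes a binary page image in which the pixels of interest are True, and it records, for every rectangle height and width up to a chosen limit, the fraction of those pixels removed by a morphological opening with that rectangle. The function names, the normalization, and the summed-area-table implementation of the opening are all assumptions made for illustration; the paper's preview does not specify them.

```python
import numpy as np


def open_with_rectangle(image, h, w):
    """Binary opening with an h x w rectangle (hypothetical helper).

    The result is the union of all h x w rectangles that fit entirely
    inside the True region of `image`.
    """
    image = np.asarray(image, dtype=bool)
    H, W = image.shape
    if h > H or w > W:
        return np.zeros((H, W), dtype=bool)

    # Summed-area table with a leading row and column of zeros.
    sat = np.zeros((H + 1, W + 1), dtype=np.int64)
    sat[1:, 1:] = image.cumsum(axis=0, dtype=np.int64).cumsum(axis=1)

    # Pixel count inside the rectangle for every top-left placement.
    sums = sat[h:, w:] - sat[:-h, w:] - sat[h:, :-w] + sat[:-h, :-w]
    fits = (sums == h * w).astype(np.int64)

    # Paint every fitting placement using a 2D difference array.
    diff = np.zeros((H + 1, W + 1), dtype=np.int64)
    diff[: H - h + 1, : W - w + 1] += fits
    diff[h:, : W - w + 1] -= fits
    diff[: H - h + 1, w:] -= fits
    diff[h:, w:] += fits
    cover = diff.cumsum(axis=0).cumsum(axis=1)[:H, :W]
    return cover > 0


def rectangular_size_distribution(image, max_height, max_width):
    """Fraction of True pixels removed by opening with each h x w rectangle.

    Entry [h - 1, w - 1] corresponds to a rectangle of height h and width w.
    Normalizing by the total pixel count is an assumption of this sketch.
    """
    image = np.asarray(image, dtype=bool)
    total = int(image.sum())
    dist = np.zeros((max_height, max_width))
    if total == 0:
        return dist
    for h in range(1, max_height + 1):
        for w in range(1, max_width + 1):
            remaining = int(open_with_rectangle(image, h, w).sum())
            dist[h - 1, w - 1] = 1.0 - remaining / total
    return dist


# Toy "page": a single block 4 pixels high and 6 pixels wide.
page = np.zeros((10, 10), dtype=bool)
page[2:6, 2:8] = True
sizes = rectangular_size_distribution(page, max_height=8, max_width=8)
```

In this toy case the block survives every rectangle that is at most 4 high and 6 wide and disappears otherwise, so the distribution steps from 0 to 1 exactly at the block's dimensions. On a real page, the positions of such steps would reflect the heights and widths of layout components, which is what makes the resulting matrix usable as a descriptor.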

Author information

Corresponding author

Correspondence to Andrew D. Bagdanov.

Additional information

Received: 16 November 2002, Accepted: 20 June 2003, Published online: 12 September 2003

About this article

Cite this article

Bagdanov, A.D., Worring, M. Multiscale document description using rectangular granulometries. IJDAR 6, 181–191 (2003). https://doi.org/10.1007/s10032-003-0112-1

  • DOI: https://doi.org/10.1007/s10032-003-0112-1
