Retrieval of Document Images Based on Page Layout Similarity

  • Conference paper
Adaptive Multimedia Retrieval: User, Context, and Feedback (AMR 2006)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4398)

Abstract

In this paper, we address the problem of document image retrieval in digital libraries. As an essential element of this problem, we propose a measure of spatial layout similarity that gives importance to the category of components in document images. We tested the method on the MediaTeam document image database, which provides a diverse collection of document images.
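
The abstract does not state the similarity measure itself. Purely as an illustration of what a category-aware layout comparison could look like, the sketch below scores two pages by the weighted overlap of their labeled blocks. Everything in it is an assumption made for this example and not the authors' method: the `Block` type, the category labels, the grid rasterization, the Jaccard overlap, and the weights.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Hypothetical layout component with page-normalized coordinates in [0, 1]."""
    category: str  # e.g. "text" or "image" (illustrative labels only)
    x0: float
    y0: float
    x1: float
    y1: float


def occupancy(blocks, category, grid=20):
    """Return the grid cells whose centre lies inside a block of `category`."""
    cells = set()
    for b in blocks:
        if b.category != category:
            continue
        for i in range(grid):
            for j in range(grid):
                cx, cy = (i + 0.5) / grid, (j + 0.5) / grid
                if b.x0 <= cx < b.x1 and b.y0 <= cy < b.y1:
                    cells.add((i, j))
    return cells


def layout_similarity(page_a, page_b, weights, grid=20):
    """Weighted mean of per-category Jaccard overlap (assumed measure).

    Returns a value in [0, 1]; 1.0 means the rasterized layouts coincide.
    """
    total = score = 0.0
    for category, w in weights.items():
        a = occupancy(page_a, category, grid)
        b = occupancy(page_b, category, grid)
        if not a and not b:
            continue  # category absent from both pages: it carries no evidence
        score += w * len(a & b) / len(a | b)
        total += w
    # Two pages with no weighted components at all are treated as identical.
    return score / total if total else 1.0


# Hypothetical pages: text on the top half, image on the bottom half.
query = [Block("text", 0.0, 0.0, 1.0, 0.5), Block("image", 0.0, 0.5, 1.0, 1.0)]
swapped = [Block("text", 0.0, 0.5, 1.0, 1.0), Block("image", 0.0, 0.0, 1.0, 0.5)]
text_only = [Block("text", 0.0, 0.0, 1.0, 0.5)]
weights = {"text": 2.0, "image": 1.0}  # assumed category importance
```

Under these assumptions, retrieval would amount to ranking the pages of a collection by descending `layout_similarity` against the query page. The per-category weights are where "importance to category" enters in this sketch.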

Editor information

Stéphane Marchand-Maillet, Eric Bruno, Andreas Nürnberger, Marcin Detyniecki

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Naveen, Guru, D.S. (2007). Retrieval of Document Images Based on Page Layout Similarity. In: Marchand-Maillet, S., Bruno, E., Nürnberger, A., Detyniecki, M. (eds) Adaptive Multimedia Retrieval: User, Context, and Feedback. AMR 2006. Lecture Notes in Computer Science, vol 4398. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71545-0_11

  • DOI: https://doi.org/10.1007/978-3-540-71545-0_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-71544-3

  • Online ISBN: 978-3-540-71545-0

  • eBook Packages: Computer Science, Computer Science (R0)
