
Ancient Printed Documents Indexation: A New Approach

  • Conference paper
Pattern Recognition and Data Mining (ICAPR 2005)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 3686)

Abstract

Based on a study of the specific characteristics of historical printed books and of the main sources of error in classical page layout analysis methods, this paper presents a new way to index ancient printed documents. We have developed an approach based on extracting and quantifying the various orientations present in printed document images. The documents are first split into homogeneous areas, in which we analyze the significant orientations with a directional rose. Each kind of information (textual or graphical) is identified and labelled according to its orientation distribution. This choice of characterization allows us to separate textual regions from graphical ones while minimizing a priori knowledge. We evaluate our proposal through document image retrieval using layout extraction criteria; the approach can also be used to precisely localize graphical parts in various types of documents. The system has been tested successfully on several ancient printed books of the Renaissance.
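
To make the idea of a directional rose concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not say how the rose is computed, so this sketch assumes it can be derived from the autocorrelation of an image region, summed along rays leaving the origin at a few sampled angles. The function names (`autocorrelation`, `directional_rose`), the brute-force circular autocorrelation, and the parameters `n_angles` and `max_radius` are all illustrative assumptions.

```python
import math


def autocorrelation(img):
    """Brute-force circular autocorrelation of a 2D list (mean removed).

    Assumption: a real system would likely use a faster method; this
    O(n^4) loop is only meant for tiny illustrative regions.
    """
    h, w = len(img), len(img[0])
    mean = sum(map(sum, img)) / (h * w)
    centered = [[v - mean for v in row] for row in img]
    acf = [[0.0] * w for _ in range(h)]
    for dy in range(h):
        for dx in range(w):
            total = 0.0
            for y in range(h):
                for x in range(w):
                    total += centered[y][x] * centered[(y + dy) % h][(x + dx) % w]
            acf[dy][dx] = total
    return acf


def directional_rose(img, n_angles=8, max_radius=None):
    """Sum autocorrelation values along rays at n_angles angles in [0, pi).

    Bin k corresponds to the angle pi * k / n_angles, measured from the
    horizontal (x) axis. A large value means the region stays similar to
    itself when shifted in that direction.
    """
    h, w = len(img), len(img[0])
    acf = autocorrelation(img)
    if max_radius is None:
        max_radius = min(h, w) // 2
    rose = []
    for k in range(n_angles):
        theta = math.pi * k / n_angles
        total = 0.0
        for r in range(1, max_radius + 1):
            dx = round(r * math.cos(theta))
            dy = round(r * math.sin(theta))
            total += acf[dy % h][dx % w]  # wrap negative shifts circularly
        rose.append(total)
    return rose


def dominant_bin(rose):
    """Index of the angle bin with the largest response."""
    return max(range(len(rose)), key=rose.__getitem__)


# Toy region: horizontal stripes, a crude stand-in for lines of text.
horizontal = [[1 if y % 4 < 2 else 0 for _ in range(8)] for y in range(8)]
# Transposed region: vertical stripes.
vertical = [list(col) for col in zip(*horizontal)]

rose_h = directional_rose(horizontal)
rose_v = directional_rose(vertical)
```

On these toy regions the strongest bin is the horizontal one for `horizontal` and the vertical one for `vertical`. A region with several comparably strong bins would have a flatter orientation distribution, which is the kind of difference the abstract relies on to separate text from graphics; how the authors turn the distribution into a label is not stated in the abstract and is not sketched here.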



Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Journet, N., Mullot, R., Ramel, JY., Eglin, V. (2005). Ancient Printed Documents Indexation: A New Approach. In: Singh, S., Singh, M., Apte, C., Perner, P. (eds) Pattern Recognition and Data Mining. ICAPR 2005. Lecture Notes in Computer Science, vol 3686. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11551188_64

  • DOI: https://doi.org/10.1007/11551188_64

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-28757-5

  • Online ISBN: 978-3-540-28758-2

  • eBook Packages: Computer Science, Computer Science (R0)
