Automatic Feature Extraction and Recognition for Digital Access of Books of the Renaissance

Muge, F.; Granado, I.; Mengucci, M.; Pina, P.; Ramos, V.; Sirakov, N.; Caldas Pinto, J. R.; Marcolino, A.; Ramalho, Mário; Vieira, P.; Maia do Amaral, A.

doi:10.1007/3-540-45268-0_1

Conference paper
First Online: 17 November 2000

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1923)

Abstract

Antique printed books constitute a heritage that should be preserved and used. With novel digitising techniques, it is now possible to store these books in digital format and make them accessible to a wider public. However, the problem of how to use them remains. DEBORA (Digital accEss to BOoks of the RenAissance) is a European project that aims to develop a system for interacting with these books through world-wide networks. The main issue is to build a database accessible through client computers. This requires building accompanying metadata that characterise the different components of the books, such as illuminated letters, banners, figures and key words, in order to simplify and speed up remote access. To solve these problems, digital image analysis algorithms for filtering, segmentation, separation of text from non-text, line and word segmentation, and word recognition were developed. Some novel ideas are presented and illustrated through examples.
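
To make the line and word segmentation step named in the abstract more concrete, the sketch below uses projection profiles on a binarised page. This is a generic textbook technique chosen for illustration only; the abstract does not describe the authors' algorithms, and the function names `segment_lines` and `segment_words`, the `min_ink` and `min_gap` parameters, and the toy page are all assumptions.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary page: 1 = ink, 0 = background (assumed convention)


def segment_lines(page: Image, min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return (top, bottom_exclusive) row ranges for runs of rows containing ink."""
    profile = [sum(row) for row in page]  # horizontal projection profile
    lines, start = [], None
    for y, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = y                      # a text line begins
        elif ink < min_ink and start is not None:
            lines.append((start, y))       # a text line ends
            start = None
    if start is not None:                  # line touching the bottom edge
        lines.append((start, len(profile)))
    return lines


def segment_words(page: Image, line: Tuple[int, int], min_gap: int = 2) -> List[Tuple[int, int]]:
    """Split one text line into (left, right_exclusive) column ranges.

    A word boundary is declared at any run of at least `min_gap` empty columns.
    """
    top, bottom = line
    width = len(page[0]) if page else 0
    # vertical projection profile restricted to the rows of this line
    cols = [sum(page[y][x] for y in range(top, bottom)) for x in range(width)]
    words, start, gap = [], None, 0
    for x, ink in enumerate(cols):
        if ink > 0:
            if start is None:
                start = x
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                words.append((start, x - gap + 1))
                start, gap = None, 0
    if start is not None:                  # word touching the right edge
        words.append((start, width - gap))
    return words


# Toy page: two text lines; the first holds two "words" separated by two empty columns.
page = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 1],
    [1, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
]
lines = segment_lines(page)
words_first = segment_words(page, lines[0])
words_second = segment_words(page, lines[1])
```

Projection profiles assume roughly horizontal, well-separated lines. Pages from Renaissance books often show skew, show-through and decorative elements, so a production system would likely need filtering and text/non-text separation before a step like this.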

Author information

Authors and Affiliations

CVRM / Centro de Geo-Sistemas, Instituto Superior Técnico, 1049-001, Av. Rovisco Pais, Lisboa, Portugal
F. Muge, I. Granado, M. Mengucci, P. Pina, V. Ramos & N. Sirakov
IDMEC, Instituto Superior Técnico, 1049-001, Av. Rovisco Pais, Lisboa, Portugal
J. R. Caldas Pinto, A. Marcolino, Mário Ramalho & P. Vieira
Biblioteca Geral da Universidade de Coimbra, 3000, Largo da Porta, Coimbra, Portugal
A. Maia do Amaral

Editor information

Editors and Affiliations

National Library of Portugal, Campo Grande, 83, 1749-081, Lisboa, Portugal
José Borbinha
GMD Library, Schloss Birlinghoven, 53754, Sankt Augustin, Germany
Thomas Baker

About this paper

Cite this paper

Muge, F. et al. (2000). Automatic Feature Extraction and Recognition for Digital Access of Books of the Renaissance. In: Borbinha, J., Baker, T. (eds) Research and Advanced Technology for Digital Libraries. ECDL 2000. Lecture Notes in Computer Science, vol 1923. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45268-0_1

DOI: https://doi.org/10.1007/3-540-45268-0_1
Published: 17 November 2000
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41023-2
Online ISBN: 978-3-540-45268-3
eBook Packages: Springer Book Archive
