Retrieval of historical documents by word spotting

Nikoleta Doulgeri; Ergina Kavallieratou

doi:10.1117/12.805602

Published: 19 January 2009

Proceedings Volume 7247, Document Recognition and Retrieval XVI; 724706 (2009) https://doi.org/10.1117/12.805602
Event: IS&T/SPIE Electronic Imaging, 2009, San Jose, California, United States

Abstract

Implementing word spotting is not easy, and it is harder still for historical documents, since it requires character recognition and indexing of the document images. We present a general, OCR-independent word-spotting technique that automatically renders the user's text queries as word images and compares them with the word images extracted from the document images. The proposed system requires no training; the only preprocessing task is determining the alphabet. Words are described by global shape features, which are kept very general to capture the form of the word and are normalized to cope with the usual variations in resolution, word width, and font. A novel technique based on interpolation is presented. In our experiments, we analyze how the system depends on its parameters and show that its performance is comparable to that of trainable systems.
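
The abstract does not specify the features or how interpolation is applied, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a binary word image, uses a column ink profile as a stand-in global shape feature, divides by image height to reduce resolution dependence, and linearly interpolates the profile to a fixed length to reduce word-width dependence. All names (`column_profile`, `resample`, `descriptor`, `distance`, `scale`) and the sample images are hypothetical.

```python
from typing import List, Sequence

Image = List[List[int]]  # binary word image: 1 = ink, 0 = background


def column_profile(img: Image) -> List[float]:
    """Fraction of ink pixels per column (dividing by height normalizes scale)."""
    height, width = len(img), len(img[0])
    return [sum(img[r][c] for r in range(height)) / height for c in range(width)]


def resample(values: Sequence[float], n: int) -> List[float]:
    """Linearly interpolate `values` onto n evenly spaced sample points."""
    if n < 2:
        raise ValueError("n must be at least 2")
    if len(values) == 1:
        return [float(values[0])] * n
    out = []
    for i in range(n):
        pos = i * (len(values) - 1) / (n - 1)  # fractional index into values
        lo = int(pos)
        hi = min(lo + 1, len(values) - 1)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out


def descriptor(img: Image, n: int = 16) -> List[float]:
    """Fixed-length descriptor (assumed feature, not taken from the paper)."""
    return resample(column_profile(img), n)


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Root-mean-square difference between two equal-length descriptors."""
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5


def scale(img: Image, fx: int, fy: int) -> Image:
    """Nearest-neighbour upscaling, mimicking the same word at a higher resolution."""
    return [[px for px in row for _ in range(fx)] for row in img for _ in range(fy)]


# Toy data: a "query" word image, the same word at twice the resolution,
# and an unrelated word image of the same size.
query = [
    [1, 0, 0, 1, 0, 1],
    [1, 1, 0, 1, 0, 1],
    [1, 0, 0, 1, 1, 1],
    [1, 0, 0, 1, 0, 1],
]
other = [
    [0, 1, 1, 0, 1, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 0, 1, 0, 0, 0],
]

d_same = distance(descriptor(query), descriptor(scale(query, 2, 2)))
d_other = distance(descriptor(query), descriptor(other))
print(f"same word, rescaled: {d_same:.3f}; different word: {d_other:.3f}")
```

In this toy setup the rescaled copy of the query lands closer to the query than the unrelated image does, which is the property a retrieval ranking would rely on. A real system would need word segmentation, query rendering from the determined alphabet, and richer features than a single profile.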

Citation

Nikoleta Doulgeri and Ergina Kavallieratou "Retrieval of historical documents by word spotting", Proc. SPIE 7247, Document Recognition and Retrieval XVI, 724706 (19 January 2009); https://doi.org/10.1117/12.805602
