19 January 2009 Retrieval of historical documents by word spotting
Nikoleta Doulgeri, Ergina Kavallieratou
Proceedings Volume 7247, Document Recognition and Retrieval XVI; 724706 (2009) https://doi.org/10.1117/12.805602
Event: IS&T/SPIE Electronic Imaging, 2009, San Jose, California, United States
Abstract
Implementing word spotting is not easy, and it becomes harder still for historical documents, since it requires character recognition and indexing of the document images. A general technique for word spotting is presented that is independent of OCR: the user's text queries are automatically represented as word images and compared with the word images extracted from the document images. The proposed system does not require training. The only required preprocessing task is alphabet determination. Global shape features are used to describe the words. They are kept very general in order to capture the form of the word, and they are normalized to cope with the usual variations in resolution, word width, and font. A novel technique based on interpolation is presented. In our experiments, we analyze the system's dependence on its parameters and show that its performance is comparable to that of trainable systems.
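The abstract does not specify which global shape features are used or how interpolation is applied, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes column-wise upper profile, lower profile, and ink density as the features, assumes linear interpolation (`numpy.interp`) to resample them to a fixed length so that words of different widths and resolutions become comparable, and assumes Euclidean distance for matching. The names `profile_features`, `word_distance`, and `n_samples` are hypothetical.

```python
import numpy as np


def profile_features(word_img, n_samples=32):
    """Fixed-length shape descriptor for a binary word image (nonzero = ink).

    Assumed features (not taken from the paper): upper profile, lower
    profile, and ink density per column, each resampled to n_samples.
    """
    img = np.asarray(word_img) > 0
    rows = np.flatnonzero(img.any(axis=1))
    cols = np.flatnonzero(img.any(axis=0))
    if rows.size == 0:
        raise ValueError("word image contains no ink pixels")

    # Crop to the bounding box of the ink so margins do not matter.
    img = img[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
    h, w = img.shape
    has_ink = img.any(axis=0)

    # Distance from the top / bottom edge to the first ink pixel in each
    # column, divided by the height to normalize for resolution.
    # Columns without ink (gaps between letters) get the maximum value 1.
    upper = np.where(has_ink, img.argmax(axis=0), h) / h
    lower = np.where(has_ink, img[::-1].argmax(axis=0), h) / h
    density = img.mean(axis=0)

    # Linear interpolation onto a common grid normalizes for word width.
    src_x = np.linspace(0.0, 1.0, w)
    dst_x = np.linspace(0.0, 1.0, n_samples)
    return np.concatenate(
        [np.interp(dst_x, src_x, p) for p in (upper, lower, density)]
    )


def word_distance(img_a, img_b, n_samples=32):
    """Euclidean distance between descriptors (an assumed matching rule)."""
    fa = profile_features(img_a, n_samples)
    fb = profile_features(img_b, n_samples)
    return float(np.linalg.norm(fa - fb))


if __name__ == "__main__":
    # Toy "query" shape: a tall stroke followed by a low body.
    query = np.zeros((8, 12), dtype=np.uint8)
    query[:, :2] = 1
    query[4:, 2:] = 1
    # The same shape at twice the resolution, and a plain bar.
    same_word = np.kron(query, np.ones((2, 2), dtype=np.uint8))
    other_word = np.ones((4, 12), dtype=np.uint8)
    print("same word, 2x resolution:", word_distance(query, same_word))
    print("different word:", word_distance(query, other_word))
```

In a retrieval setting along the lines the abstract describes, the query image would be rendered from the user's text and ranked against every word image segmented from the document pages by this kind of distance.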
© (2009) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Nikoleta Doulgeri and Ergina Kavallieratou "Retrieval of historical documents by word spotting", Proc. SPIE 7247, Document Recognition and Retrieval XVI, 724706 (19 January 2009); https://doi.org/10.1117/12.805602
CITATIONS
Cited by 4 scholarly publications.
KEYWORDS
Image segmentation
Optical character recognition
Feature extraction
Image retrieval
Image analysis
Image processing
Image resolution
