4 February 2013
WFST-based ground truth alignment for difficult historical documents with text modification and layout variations
Proceedings Volume 8658, Document Recognition and Retrieval XX; 865818 (2013) https://doi.org/10.1117/12.2003134
Event: IS&T/SPIE Electronic Imaging, 2013, Burlingame, California, United States
Abstract
This work proposes several approaches for generating correspondences between real scanned books and their transcriptions, which may differ in text modifications and layout, while also taking OCR errors into account. Our approaches for the alignment between the manuscript and the transcription are based on weighted finite state transducers (WFSTs). In particular, we propose adapted WFSTs that represent the transcription to be aligned with the OCR lattices. The character-level alignment includes edit rules that allow edit operations (insertion, deletion, substitution). These edit operations allow the transcription model to cope with OCR segmentation and recognition errors, and also with the task of aligning different text editions. We implemented an alignment model with a hyphenation model, so that it can accommodate a non-hyphenated transcription. Our models also handle Fraktur ligatures, which are typically found in historical Fraktur documents. We evaluated our approach on Fraktur documents from the "Wanderungen durch die Mark Brandenburg" volumes (1862-1889) and observed the performance of the models under OCR errors. We compare the performance of our model for four different scenarios: having no information about the correspondence at the (i) word, (ii) line, (iii) sentence, or (iv) page level.
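The edit operations named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' WFST implementation. It assumes a single best OCR string instead of an OCR lattice, and uses plain dynamic programming, which finds the same cheapest edit path that a shortest-path search would find through a transcription acceptor composed with an edit transducer. The function name `align` and the unit costs are hypothetical; the abstract does not state the actual weights.

```python
from typing import List, Tuple

# Hypothetical unit costs; the weights used in the paper are not given in the abstract.
INS_COST = 1.0  # OCR output contains a character missing from the transcription
DEL_COST = 1.0  # transcription character has no counterpart in the OCR output
SUB_COST = 1.0  # OCR output character differs from the transcription character

Op = Tuple[str, str, str]  # (operation, transcription char, OCR char)


def align(transcription: str, ocr: str) -> Tuple[float, List[Op]]:
    """Return the minimum edit cost and one cheapest character-level alignment."""
    n, m = len(transcription), len(ocr)
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * DEL_COST
    for j in range(1, m + 1):
        cost[0][j] = j * INS_COST
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = 0.0 if transcription[i - 1] == ocr[j - 1] else SUB_COST
            cost[i][j] = min(
                cost[i - 1][j - 1] + step,   # match or substitution
                cost[i - 1][j] + DEL_COST,   # deletion
                cost[i][j - 1] + INS_COST,   # insertion
            )

    # Trace back from the end to recover one cheapest sequence of operations.
    ops: List[Op] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = 0.0 if transcription[i - 1] == ocr[j - 1] else SUB_COST
            if cost[i][j] == cost[i - 1][j - 1] + step:
                name = "match" if step == 0.0 else "sub"
                ops.append((name, transcription[i - 1], ocr[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + DEL_COST:
            ops.append(("del", transcription[i - 1], ""))
            i -= 1
            continue
        ops.append(("ins", "", ocr[j - 1]))
        j -= 1
    ops.reverse()
    return cost[n][m], ops


if __name__ == "__main__":
    total, operations = align("Brandenburg", "Brandenbnrg")
    print(total)
    for op in operations:
        print(op)
```

Running the sketch on a transcription word and a noisy OCR reading of it lists one operation per aligned character. A lattice-based version would instead score many OCR hypotheses at once, which is where a WFST formulation becomes useful.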
© (2013) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Mayce Al Azawi, Marcus Liwicki, and Thomas M. Breuel "WFST-based ground truth alignment for difficult historical documents with text modification and layout variations", Proc. SPIE 8658, Document Recognition and Retrieval XX, 865818 (4 February 2013); https://doi.org/10.1117/12.2003134
CITATIONS
Cited by 13 scholarly publications.
KEYWORDS
Optical character recognition

Image segmentation

Neodymium

Picosecond phenomena

Information technology

Performance modeling

Databases
