Automatic benchmarking scheme for page segmentation
23 March 1994
Sabine Randriamasy, Luc M. Vincent, Ben S. Wittner
Proceedings Volume 2181, Document Recognition; (1994) https://doi.org/10.1117/12.171109
Event: IS&T/SPIE 1994 International Symposium on Electronic Imaging: Science and Technology, 1994, San Jose, CA, United States
Abstract
An automatic, bitmap-level, set-based benchmarking scheme for page segmentation is presented. It compares segmentation results with predefined "ground truth files" that contain all the possible correct solutions. A successful page segmentation is a necessary precondition for a successful document recognition process. Two problems are addressed here: designing methods to describe all possible correct segmentations for a given page, and designing methods to compare two segmentations. The proposed ground truth representation scheme defines ground truth text regions as non-mergeable maximal sets of text lines, merged in a language-dependent direction. It covers the other possible correct segmentations by explicitly specifying the authorized cuts within each region. At this low-level stage, the quality criteria for a page segmentation are mainly defined as providing correct input for region ordering and classification. The qualitative and quantitative evaluation method tests the overlap between the two sets of regions, where each region is defined as the set of black pixels contained in the derived polygons.
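To make the bitmap-level, set-based comparison concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a binary bitmap stored as nested lists (1 = black), polygons given as (x, y) vertex lists, and an even-odd ray-casting test applied to pixel centers. All function names and the toy page are hypothetical, and the paper's actual overlap measures are not given in the abstract.

```python
from typing import List, Set, Tuple

Pixel = Tuple[int, int]               # (row, col) of a black pixel
Polygon = List[Tuple[float, float]]   # vertices as (x, y), x = column, y = row


def point_in_polygon(x: float, y: float, poly: Polygon) -> bool:
    """Even-odd ray casting: count edge crossings of a ray going right."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line at y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def region_pixels(bitmap: List[List[int]], poly: Polygon) -> Set[Pixel]:
    """A region is the set of BLACK pixels whose centers lie in the polygon."""
    return {
        (r, c)
        for r, row in enumerate(bitmap)
        for c, value in enumerate(row)
        if value == 1 and point_in_polygon(c + 0.5, r + 0.5, poly)
    }


def overlap_table(truth: List[Set[Pixel]], result: List[Set[Pixel]]) -> List[List[int]]:
    """Entry [i][j] = number of black pixels shared by truth i and result j."""
    return [[len(t & s) for s in result] for t in truth]


# Toy page (assumption): two text columns on top, one text line below.
bitmap = [
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
]
truth_polys = [[(0, 0), (5, 0), (5, 2), (0, 2)], [(0, 3), (5, 3), (5, 4), (0, 4)]]
result_polys = [
    [(0, 0), (2, 0), (2, 2), (0, 2)],   # left half of the top region
    [(3, 0), (5, 0), (5, 2), (3, 2)],   # right half of the top region
    [(0, 3), (5, 3), (5, 4), (0, 4)],   # bottom line
]
truth = [region_pixels(bitmap, p) for p in truth_polys]
result = [region_pixels(bitmap, p) for p in result_polys]
table = overlap_table(truth, result)
print(table)  # [[4, 4, 0], [0, 0, 5]]
```

Because regions are sets of black pixels, two polygons of different shape that enclose the same black pixels count as the same region. In the toy page, widening the first result polygon to include the white column between the two text columns leaves its pixel set unchanged. Whether the split of the top ground truth region into two results is acceptable would depend on the authorized cuts recorded in the ground truth file, which this sketch does not model.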
© (1994) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Sabine Randriamasy, Luc M. Vincent, and Ben S. Wittner "Automatic benchmarking scheme for page segmentation", Proc. SPIE 2181, Document Recognition, (23 March 1994); https://doi.org/10.1117/12.171109
KEYWORDS
Image segmentation
Optical character recognition
Halftones
Detection and tracking algorithms
Roentgenium
Visualization
Error analysis