Abstract.
This paper considers a statistical approach to augmenting a limited database of groundtruth documents for use in evaluating optical character recognition (OCR) software. A modified moving-blocks bootstrap procedure is used to construct surrogate documents for this purpose; these surrogates prove to serve effectively and, in some regards, are indistinguishable from groundtruth. The proposed method is validated through a rigorous statistical procedure.
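For readers unfamiliar with the technique named above, the following is a minimal sketch of the standard (unmodified) moving-blocks bootstrap, in which a sequence is resampled by concatenating randomly chosen overlapping blocks of consecutive observations. It is not the authors' modified procedure, whose details are not given in the abstract. The function name, the parameters `series` and `block_len`, and the idea of applying it to a per-line sequence of OCR measurements are assumptions made purely for illustration.

```python
import random


def moving_blocks_bootstrap(series, block_len, rng=None):
    """Return a surrogate sequence built from overlapping blocks of `series`.

    Illustrative sketch of the textbook moving-blocks bootstrap only;
    the paper's modified variant is not reproduced here.
    """
    rng = rng or random.Random()
    n = len(series)
    if not 1 <= block_len <= n:
        raise ValueError("block_len must be between 1 and len(series)")

    # Every window of block_len consecutive items is an eligible block,
    # so there are n - block_len + 1 possible start positions.
    n_starts = n - block_len + 1

    surrogate = []
    while len(surrogate) < n:
        start = rng.randrange(n_starts)  # pick a block uniformly at random
        surrogate.extend(series[start:start + block_len])

    # Trim the final block so the surrogate has the original length.
    return surrogate[:n]
```

As a usage example, `moving_blocks_bootstrap(per_line_values, block_len=5)` would return a resampled sequence of the same length as the hypothetical list `per_line_values`. Sampling whole blocks rather than single items preserves short-range dependence between neighboring observations, which is the usual reason for preferring this scheme over an ordinary bootstrap on sequential data.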
Received: March 30, 2000 / Revised: September 14, 2001
Cite this article
Brundick, F., Brodeen, A. & Taylor, M. A statistical approach to the generation of a database for evaluating OCR software. IJDAR 4, 170–176 (2002). https://doi.org/10.1007/s100320200067
DOI: https://doi.org/10.1007/s100320200067