Abstract.
We present a method for assessing the quality of a typewritten document image and automatically selecting an optimal restoration method based on that assessment. We use five quality measures that assess the severity of background speckle, touching characters, and broken characters. A linear classifier uses these measures to select a restoration method. On a 139-document corpus, our method reduced the corpus OCR character error rate from 20.27% to 12.60%.
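The selection step can be illustrated with a minimal sketch, assuming the classifier scores each candidate restoration method as a weighted sum of the five quality measures and picks the highest score. The abstract does not name the restoration methods, the individual measures, or any trained weights, so the labels, weights, and biases below are hypothetical placeholders, not values from the paper. The helper `relative_reduction` only restates the reported error rates as a relative improvement.

```python
# Hypothetical labels: the abstract does not name the restoration methods.
LABELS = ["no_restoration", "despeckle", "separate_touching", "repair_broken"]

# Illustrative weights only (one row per label, one column per quality
# measure). These are NOT the trained values from the paper.
WEIGHTS = [
    [-1.0, -1.0, -1.0, -1.0, -1.0],
    [2.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, 0.0, 0.0],
]
BIASES = [0.5, 0.0, 0.0, 0.0]


def select_method(quality, weights=WEIGHTS, biases=BIASES, labels=LABELS):
    """Return the label whose linear score w . x + b is largest."""
    scores = [
        sum(w * q for w, q in zip(row, quality)) + b
        for row, b in zip(weights, biases)
    ]
    best = max(range(len(scores)), key=scores.__getitem__)
    return labels[best]


def relative_reduction(before, after):
    """Fraction by which an error rate dropped, relative to its start."""
    return (before - after) / before


# A page with a high (hypothetical) speckle measure and little else.
choice = select_method([0.9, 0.1, 0.1, 0.0, 0.0])

# Reported corpus error rates: 20.27% before, 12.60% after.
gain = relative_reduction(20.27, 12.60)
```

With the reported figures, the drop from 20.27% to 12.60% is 7.67 percentage points, or roughly a 37.8% relative reduction in character errors.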
Additional information
Received November 10, 1998 / Revised October 27, 1999
About this article
Cite this article
Cannon, M., Hochberg, J. & Kelly, P. Quality assessment and restoration of typewritten document images. IJDAR 2, 80–89 (1999). https://doi.org/10.1007/s100320050039
DOI: https://doi.org/10.1007/s100320050039