Abstract
This work presents a solution to the problem of segmenting digits in low-quality forms that contain broken and touching digits. We propose a new segmentation function that adds information about the digit's background to two traditional techniques (vertical projections and the Tsujimoto metric). Unlike other techniques reported in the literature, ours obtains a near-optimal number of break points in fields containing broken, blurred and touching characters, leading to high accuracy in the overall OCR system. The segmentation accuracy on form fields is 99.74% on a sample of 11,283 fields from 144 low-quality forms, which gives the automatic recognition process a final accuracy of 99.42% of digits correctly classified.
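As a rough illustration of the first of the two traditional techniques named above, the sketch below computes a vertical projection profile and proposes candidate break points at low-ink column runs. It is not the cost function of the paper: the Tsujimoto metric and the background information are omitted because the abstract does not specify them, and the names `vertical_projection`, `candidate_breaks`, and the `max_ink` threshold are assumptions made for this example only.

```python
import numpy as np

def vertical_projection(image):
    """Count foreground (ink) pixels in each column of a binary image."""
    return np.asarray(image, dtype=bool).sum(axis=0)

def candidate_breaks(image, max_ink=0):
    """Return the centre column of each interior run of low-ink columns.

    A column is "low-ink" when it holds at most max_ink foreground pixels
    (max_ink is a hypothetical tuning parameter). Runs touching the left or
    right border are margins, not gaps between digits, so they are skipped.
    """
    low = vertical_projection(image) <= max_ink
    breaks = []
    start = None
    for x, is_low in enumerate(low):
        if is_low and start is None:
            start = x                      # a low-ink run begins
        elif not is_low and start is not None:
            if start > 0:                  # ignore the left margin
                breaks.append((start + x - 1) // 2)
            start = None                   # the run has ended
    return breaks                          # an unclosed run is the right margin

# Toy field: two 2-pixel-wide strokes separated by a 2-column gap.
field = np.array([
    [0, 1, 1, 0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0, 1, 1, 0],
])
print(candidate_breaks(field))
```

With `max_ink=0` only fully blank columns qualify, so touching digits yield no break at all and broken digits yield spurious ones, which is the kind of failure that motivates combining projections with further evidence.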
References
R.G. Casey, E. Lecolinet: A Survey of Methods and Strategies in Character Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 7, pp. 690–706, 1996.
A.C. Downton, R.W.S. Tregidgo, E. Kabir: Recognition and Verification of Handwritten and Hand-printed British Postal Addresses. Character & Handwriting Recognition, Ed: P.S.P. Wang, pp. 265–291, World Scientific series in Computer Science Vol. 30, 1991.
Y. Lu: Machine Printed Character Segmentation — An overview. Pattern Recognition, Vol. 28, No. 1, pp. 67–80, 1995.
J. Muguerza: Una Solución al Reconocimiento Automático de Dígitos Imprecisos en Formularios. Doctoral Thesis, Basque Country University, Spain, January 1996.
C. Rodríguez, J. Muguerza, M. Navarro, A. Zárate, J.I. Martín, J.M. Pérez: A Two-Stage Classifier for Broken and Blurred Digits in Forms. Accepted for presentation at the 2nd International Workshop on Statistical Techniques in Pattern Recognition, Sydney, Australia, 1998.
S. Tsujimoto, H. Asada: Resolving Ambiguity in Segmenting Touching Characters. The First International Conference on Document Analysis and Recognition, pp. 701–709, 1991.
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rodríguez, C., Muguerza, J., Navarro, M., Zárate, A., Martín, J.I., Pérez, J.M. (1998). A new cost function for typewritten digits segmentation. In: Amin, A., Dori, D., Pudil, P., Freeman, H. (eds) Advances in Pattern Recognition. SSPR /SPR 1998. Lecture Notes in Computer Science, vol 1451. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0033327
DOI: https://doi.org/10.1007/BFb0033327
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64858-1
Online ISBN: 978-3-540-68526-5
eBook Packages: Springer Book Archive