Abstract.
This paper proposes a method for recognizing digits in natural scenes, such as telephone numbers on signboards. Candidate digit regions are extracted from an image through contrast enhancement, edge extraction, and labeling. Unlike in traditional character recognition, the target text lies in 3D space, so its image is distorted by its orientation in that space and by projection. This distortion must be canceled as far as possible before digit recognition. We model the distortion as skew and slant. Skew is normalized with a simplified Hough transform; the remaining distortion is then corrected by circumscribing digit patterns with tilted rectangles and applying an affine transformation. In experiments on 1,332 signboard images containing 11,939 digits, we obtained a digit extraction rate of 99.2% and a correct digit recognition rate of 98.8%.
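The skew normalization step can be illustrated with a minimal sketch. It is not the paper's implementation, since the abstract does not specify the details of the simplified Hough transform. The sketch assumes that each candidate digit region has already been reduced to one reference point (for example, the bottom center of its bounding box), votes over a small set of candidate angles, and picks the angle along which the most points line up. The function names `estimate_skew_deg` and `rotate_points`, the angle range, and the bin width are all hypothetical choices made for illustration.

```python
import numpy as np


def estimate_skew_deg(points, min_deg=-45.0, max_deg=45.0, step_deg=0.5, rho_bin=2.0):
    """Estimate the skew angle (degrees) of a row of points by Hough-style voting.

    Assumption: `points` is an (N, 2) array of (x, y) reference points, one per
    candidate digit region. This is an illustrative stand-in for the paper's
    "simplified Hough transform", not a reproduction of it.
    """
    pts = np.asarray(points, dtype=float)
    best_angle, best_votes = 0.0, -1
    for angle in np.arange(min_deg, max_deg + step_deg, step_deg):
        t = np.deg2rad(angle)
        # Signed distance of each point from a line through the origin
        # whose direction makes the angle t with the x axis.
        rho = -pts[:, 0] * np.sin(t) + pts[:, 1] * np.cos(t)
        # Quantize the distances; points on one line share a bin.
        bins = np.round(rho / rho_bin).astype(int)
        votes = int(np.bincount(bins - bins.min()).max())
        if votes > best_votes:
            best_angle, best_votes = float(angle), votes
    return best_angle


def rotate_points(points, angle_deg):
    """Rotate (x, y) points about the origin by angle_deg (counterclockwise)."""
    t = np.deg2rad(angle_deg)
    c, s = np.cos(t), np.sin(t)
    rot = np.array([[c, -s], [s, c]])
    return np.asarray(points, dtype=float) @ rot.T


# Ten synthetic reference points lying on a baseline tilted by 10 degrees.
xs = np.arange(0.0, 200.0, 20.0)
pts = np.column_stack([xs, xs * np.tan(np.deg2rad(10.0))])

skew = estimate_skew_deg(pts)
deskewed = rotate_points(pts, -skew)  # rotate back so the baseline is horizontal
```

Rotating by the negative of the estimated angle brings the baseline back to horizontal. Slant, the second component in the paper's model, is not handled by this sketch and would need a separate correction such as the affine transformation the abstract mentions.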
Additional information
Received: 15 December 2003, Accepted: 21 October 2004, Published online: 2 February 2005
About this article
Cite this article
Yamaguchi, T., Maruyama, M., Miyao, H. et al. Digit recognition in a natural scene with skew and slant normalization. IJDAR 7, 168–177 (2005). https://doi.org/10.1007/s10032-004-0136-1
DOI: https://doi.org/10.1007/s10032-004-0136-1