
Digit recognition in a natural scene with skew and slant normalization

International Journal of Document Analysis and Recognition (IJDAR)

Abstract.

This paper proposes a method for recognizing digits in a natural scene, such as telephone numbers on a signboard. Candidate digit regions are extracted from an image through contrast enhancement, edge extraction, and labeling. Unlike in traditional character recognition problems, the target text patterns lie in 3D space, so the image is distorted by the orientation of the text in 3D space and by projection. This distortion must be canceled as far as possible before digit recognition. The proposed method models the distortion as skew and slant. A simplified Hough transform is used for skew normalization. The distortion that remains after skew normalization is then corrected by circumscribing the digit patterns with tilted rectangles and applying an affine transformation. In experiments on a total of 1,332 signboard images containing 11,939 digits, we obtained a digit extraction rate of 99.2% and a correct digit recognition rate of 98.8%.
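
The two normalization steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the Hough transform is simplified or how the tilted rectangles are fitted, so the sketch assumes a plain angle-voting scheme over reference points (for example, the bottom centers of candidate digit regions) and a minimum-width shear search as a stand-in for the tilted-rectangle fit. All function names, the angle grids, and `rho_step` are hypothetical.

```python
import numpy as np

def estimate_skew_deg(points, angles_deg=None, rho_step=1.0):
    """Hough-style vote: return the candidate angle (degrees) whose line
    collects the most points. Assumed scheme, not the paper's exact one."""
    pts = np.asarray(points, dtype=float)
    if angles_deg is None:
        angles_deg = np.arange(-45.0, 45.5, 0.5)  # assumed search grid
    best_angle, best_votes = 0.0, -1
    for a in angles_deg:
        t = np.deg2rad(a)
        # Signed offset of each point from a line of direction `a` through the origin.
        rho = pts[:, 1] * np.cos(t) - pts[:, 0] * np.sin(t)
        bins = np.round(rho / rho_step).astype(int)
        votes = np.bincount(bins - bins.min()).max()  # fullest offset bin
        if votes > best_votes:
            best_angle, best_votes = float(a), int(votes)
    return best_angle

def rotate(points, angle_deg):
    """Rotate 2D points counterclockwise by angle_deg about the origin."""
    t = np.deg2rad(angle_deg)
    rot = np.array([[np.cos(t), -np.sin(t)],
                    [np.sin(t),  np.cos(t)]])
    return np.asarray(points, dtype=float) @ rot.T

def estimate_slant_deg(points, angles_deg=None):
    """Return the shear angle (degrees) that minimizes the horizontal extent
    of the pattern, i.e. the tilt of its tightest circumscribing parallelogram."""
    pts = np.asarray(points, dtype=float)
    if angles_deg is None:
        angles_deg = np.arange(-30.0, 30.5, 0.5)  # assumed search grid
    widths = [np.ptp(pts[:, 0] - pts[:, 1] * np.tan(np.deg2rad(a)))
              for a in angles_deg]
    return float(angles_deg[int(np.argmin(widths))])

def unshear(points, slant_deg):
    """Affine (shear) map x' = x - y * tan(slant), y' = y."""
    pts = np.asarray(points, dtype=float).copy()
    pts[:, 0] -= pts[:, 1] * np.tan(np.deg2rad(slant_deg))
    return pts
```

In this sketch, skew would be removed with `rotate(points, -estimate_skew_deg(points))`, after which `unshear(points, estimate_slant_deg(points))` would be applied to each digit pattern separately. Coordinates are treated as ordinary Cartesian pairs; with image coordinates, where the y axis points down, the sign of the angles flips.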



Author information

Correspondence to Takuma Yamaguchi.

Additional information

Received: 15 December 2003, Accepted: 21 October 2004, Published online: 2 February 2005

About this article

Cite this article

Yamaguchi, T., Maruyama, M., Miyao, H. et al. Digit recognition in a natural scene with skew and slant normalization. IJDAR 7, 168–177 (2005). https://doi.org/10.1007/s10032-004-0136-1

  • DOI: https://doi.org/10.1007/s10032-004-0136-1
