Recognizing Natural Scene Characters by Convolutional Neural Network and Bimodal Image Enhancement

  • Conference paper
Camera-Based Document Analysis and Recognition (CBDAR 2011)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7139)

Abstract

This paper proposes a natural scene character recognition method that combines a convolutional neural network (CNN) with bimodal image enhancement. A CNN-based grayscale character recognizer has strong tolerance to the degradations found in natural scene images. Because a character image is in essence a bimodal pattern image, bimodal image enhancement is adopted to improve the performance of the CNN classifier. First, a maximum-separability-based color-to-gray method is used to strengthen the discriminative power in grayscale image space. Second, grayscale distribution normalization based on histogram alignment is performed; by increasing the data consistency between grayscale training and test samples, it leads to a better CNN classifier. Third, a shape-holding grayscale character image normalization is adopted. Based on these measures, a high-performance natural scene character recognizer is constructed. Its recognition rate of 85.96% on the ICDAR 2003 robust OCR dataset is higher than that reported in existing works, which verifies the effectiveness of the proposed method.
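The abstract does not spell out how the maximum-separability color-to-gray step is formulated, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes that separability is measured by Otsu's criterion (best between-class variance divided by total variance of the projected values) and that the projection axis is picked from a coarse, hypothetical grid of 26 RGB directions. The function names `otsu_separability` and `max_separability_gray` are illustrative only.

```python
from itertools import product


def otsu_separability(values, bins=64):
    """Best between-class / total variance ratio (0..1) over all thresholds."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # constant projection carries no bimodal structure
    n = len(values)
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / (hi - lo) * bins), bins - 1)] += 1
    total_sum = sum(i * h for i, h in enumerate(hist))
    total_mean = total_sum / n
    total_var = sum(h * (i - total_mean) ** 2 for i, h in enumerate(hist)) / n
    best, w0, sum0 = 0.0, 0, 0.0
    for t in range(bins - 1):  # threshold between bin t and bin t + 1
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = n - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (total_sum - sum0) / w1
        between = (w0 / n) * (w1 / n) * (m0 - m1) ** 2
        best = max(best, between / total_var)
    return best


def max_separability_gray(pixels):
    """Project RGB pixels onto the candidate axis with the highest separability.

    Assumption: candidate axes are the 26 non-zero directions with
    components in {-1, 0, 1}; a real system would search a finer set.
    Returns (gray values rescaled to 0..255, chosen axis, its score).
    """
    best_axis, best_score, best_proj = None, -1.0, None
    for axis in product((-1, 0, 1), repeat=3):
        if axis == (0, 0, 0):
            continue
        proj = [sum(a * c for a, c in zip(axis, p)) for p in pixels]
        score = otsu_separability(proj)
        if score > best_score:
            best_axis, best_score, best_proj = axis, score, proj
    lo, hi = min(best_proj), max(best_proj)
    span = (hi - lo) or 1
    gray = [round(255 * (v - lo) / span) for v in best_proj]
    return gray, best_axis, best_score


# Toy example: reddish text on a greenish background, with a blue channel
# that varies independently of the text/background class.
text = [(200, 60, b) for b in range(0, 256, 16)]
background = [(60, 200, b) for b in range(0, 256, 16)]
gray, axis, score = max_separability_gray(text + background)
print(axis, round(score, 3))
```

In this toy image the plain channel sum R + G + B is dominated by the blue variation and cannot tell text from background, whereas an axis that ignores blue yields a cleanly bimodal grayscale image. That is the kind of gain in discriminative power the color-to-gray step is meant to provide.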

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhu, Y., Sun, J., Naoi, S. (2012). Recognizing Natural Scene Characters by Convolutional Neural Network and Bimodal Image Enhancement. In: Iwamura, M., Shafait, F. (eds) Camera-Based Document Analysis and Recognition. CBDAR 2011. Lecture Notes in Computer Science, vol 7139. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29364-1_6

  • DOI: https://doi.org/10.1007/978-3-642-29364-1_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-29363-4

  • Online ISBN: 978-3-642-29364-1

  • eBook Packages: Computer Science, Computer Science (R0)
