Abstract
Reading text from natural images is much more difficult than reading it from scanned documents, since the text may appear in any color, in different sizes and typefaces, and often with distorted geometry or applied textures. This paper presents high-speed image preprocessing algorithms based on quasi-local histogram methods, such as binarization, ROI filtering, and line and corner detection, which can support this task. Their low computational cost comes from reducing the amount of processed data by means of simple random sampling. The presented approach reduces some of the problems of implementing OCR algorithms that operate on natural images on devices with low computing power (e.g. mobile or embedded devices). Because the computational effort is relatively small, multiple hypotheses can be tested, e.g. concerning the possible location of the text in the image, and their verification can be based on the analysis of images in various color spaces. An additional advantage of the discussed algorithms is that their construction allows an efficient parallel implementation, which further reduces the computation time.
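The idea of lowering the cost of histogram-based binarization by random sampling can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which thresholding rule is used, so the well-known Otsu criterion is assumed here, and the function names, the sample count and the synthetic test image are all hypothetical. The histogram is built from a fixed number of randomly drawn pixels instead of the whole image, and the threshold estimated from that sample is then applied to every pixel.

```python
import random


def sampled_histogram(image, n_samples, rng):
    """Build a 256-bin histogram from n_samples randomly drawn pixels.

    `image` is a list of rows of 8-bit grayscale values (assumed layout).
    """
    height, width = len(image), len(image[0])
    hist = [0] * 256
    for _ in range(n_samples):
        y = rng.randrange(height)
        x = rng.randrange(width)
        hist[image[y][x]] += 1
    return hist


def otsu_threshold(hist):
    """Return the level t that maximises the between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    Otsu's criterion is an assumption of this sketch, not taken from the paper.
    """
    total = sum(hist)
    weighted_total = sum(level * count for level, count in enumerate(hist))
    best_t, best_score = 0, -1.0
    count_low = 0
    sum_low = 0
    for t in range(256):
        count_low += hist[t]
        sum_low += t * hist[t]
        count_high = total - count_low
        if count_low == 0 or count_high == 0:
            continue  # one class is empty, no valid split at this level
        mean_low = sum_low / count_low
        mean_high = (weighted_total - sum_low) / count_high
        # Proportional to the between-class variance (constant factor dropped).
        score = count_low * count_high * (mean_low - mean_high) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t


def binarize(image, threshold):
    """Map pixels above the threshold to 1 and all others to 0."""
    return [[1 if p > threshold else 0 for p in row] for row in image]


# Hypothetical usage: a dark 64x64 image with one bright rectangular region.
rng = random.Random(0)
image = [[200 if 10 <= y < 26 and 16 <= x < 48 else 40 for x in range(64)]
         for y in range(64)]
hist = sampled_histogram(image, 500, rng)   # 500 samples instead of 4096 pixels
binary = binarize(image, otsu_threshold(hist))
```

In this sketch the cost of building the histogram depends on the number of samples rather than on the image size, which is the property the abstract attributes to the sampling step. Only the final thresholding pass touches every pixel.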
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Lech, P., Okarma, K. (2016). Methods of Natural Image Preprocessing Supporting the Automatic Text Recognition Using the OCR Algorithms. In: Choraś, R. (eds) Image Processing and Communications Challenges 7. Advances in Intelligent Systems and Computing, vol 389. Springer, Cham. https://doi.org/10.1007/978-3-319-23814-2_17
DOI: https://doi.org/10.1007/978-3-319-23814-2_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23813-5
Online ISBN: 978-3-319-23814-2
eBook Packages: Engineering, Engineering (R0)