Abstract
This paper presents a hybrid model for detecting text regions in scene images as well as document images. First, the background is suppressed to isolate foreground regions. Morphological operations are then applied to the isolated foreground regions to ensure appropriate region boundaries for these objects. Statistical features are extracted from the objects, and a multi-layer perceptron classifies each as text or non-text. Components classified as text are localized, and non-text components are ignored. In experiments on a data set of 227 camera-captured images, the object isolation accuracy is 0.8638 and the text/non-text classification accuracy is 0.9648. For images with a nearly homogeneous background, the present method yields reasonably satisfactory accuracy for practical applications.
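The pipeline stages named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the abstract does not say how the background is suppressed, which morphological operations are used, or which statistical features are extracted. The sketch assumes Otsu thresholding, a 3x3 binary closing, and two simple shape features (aspect ratio and fill ratio), and it runs on a synthetic image with dark text on a light background. All function names are hypothetical, and the multi-layer perceptron stage is omitted; the feature dictionaries are what such a classifier would receive.

```python
from collections import deque

NEIGHBORS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

def otsu_threshold(gray):
    """Gray level maximizing between-class variance (assumed suppression step)."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = sum0 = 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        var = w0 * w1 * (sum0 / w0 - (sum_all - sum0) / w1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def morph(mask, op):
    """3x3 dilation (op=any) or erosion (op=all); outside the image is background."""
    h, w = len(mask), len(mask[0])
    def at(y, x):
        return 0 <= y < h and 0 <= x < w and mask[y][x]
    return [[op(at(y + dy, x + dx) for dy, dx in NEIGHBORS) for x in range(w)]
            for y in range(h)]

def components(mask):
    """8-connected components as (x0, y0, x1, y1, area) tuples, in scan order."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            x0 = x1 = x
            y0 = y1 = y
            area = 0
            while queue:
                cy, cx = queue.popleft()
                area += 1
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for dy, dx in NEIGHBORS:
                    ny, nx = cy + dy, cx + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((x0, y0, x1, y1, area))
    return boxes

def features(box):
    """Illustrative shape features; the paper's actual feature set is not given here."""
    x0, y0, x1, y1, area = box
    width, height = x1 - x0 + 1, y1 - y0 + 1
    return {"aspect": width / height, "fill": area / (width * height)}

# Synthetic 8x12 image: light background (200), two dark blobs (30).
gray = [[200] * 12 for _ in range(8)]
for y in range(1, 4):
    for x in range(1, 4):
        gray[y][x] = 30
for y in range(4, 7):
    for x in range(7, 11):
        gray[y][x] = 30
gray[5][8] = 200  # one-pixel hole inside the second blob

t = otsu_threshold(gray)
mask = [[v <= t for v in row] for row in gray]   # dark pixels are foreground
closed = morph(morph(mask, any), all)            # closing: dilate, then erode
boxes = components(closed)
feats = [features(b) for b in boxes]
```

In this toy case the closing fills the one-pixel hole, so the second component's bounding box is completely filled. Real camera-captured images would need a larger structuring element, size filtering of components, and a trained classifier in place of the omitted final stage.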
Acknowledgements
The authors are thankful to the Department of Computer Science and Engineering of Aliah University for its support in carrying out this work. The first author is also thankful to Aliah University for providing a research fellowship.
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Khan, T., Mollah, A.F. (2018). A Novel Text Localization Scheme for Camera Captured Document Images. In: Chaudhuri, B., Kankanhalli, M., Raman, B. (eds) Proceedings of 2nd International Conference on Computer Vision & Image Processing. Advances in Intelligent Systems and Computing, vol 703. Springer, Singapore. https://doi.org/10.1007/978-981-10-7895-8_20
DOI: https://doi.org/10.1007/978-981-10-7895-8_20
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-7894-1
Online ISBN: 978-981-10-7895-8
eBook Packages: Engineering (R0)