A Novel Text Localization Scheme for Camera Captured Document Images

Khan, Tauseef; Mollah, Ayatullah Faruk

doi:10.1007/978-981-10-7895-8_20

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 703)

Abstract

In this paper, a hybrid model for detecting text regions in scene images as well as document images is presented. First, the background is suppressed to isolate foreground regions. Then, morphological operations are applied to the isolated foreground regions to ensure appropriate region boundaries for these objects. Statistical features are extracted from the objects to classify them as text or non-text using a multi-layer perceptron. Classified text components are localized, and non-text ones are ignored. In experiments on a data set of 227 camera-captured images, the object isolation accuracy is 0.8638 and the text/non-text classification accuracy is 0.9648. For images with a near-homogeneous background, the present method yields reasonably satisfactory accuracy for practical applications.
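
The first three stages named in the abstract (background suppression, morphological refinement, statistical feature extraction) can be sketched roughly as follows. This is a minimal pure-Python illustration under stated assumptions, not the authors' implementation: Otsu's global threshold stands in for background suppression, a 3x3 morphological closing stands in for the boundary refinement, and bounding-box aspect ratio and fill ratio stand in for the statistical features. All function names are hypothetical, dark text on a lighter background is assumed, and the multi-layer perceptron step is left out because it needs trained weights.

```python
from collections import deque

def otsu_threshold(gray):
    """Gray level (0-255) that maximizes the between-class variance."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var, w0, sum0 = 0, -1.0, 0, 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        var = w0 * w1 * (sum0 / w0 - (sum_all - sum0) / w1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def suppress_background(gray):
    """Binary mask: 1 = foreground. Assumes text is darker than background."""
    t = otsu_threshold(gray)
    return [[1 if v <= t else 0 for v in row] for row in gray]

def dilate(mask):
    """3x3 dilation; neighbours outside the image are simply skipped."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                for ny in range(max(0, y - 1), min(h, y + 2)):
                    for nx in range(max(0, x - 1), min(w, x + 2)):
                        out[ny][nx] = 1
    return out

def close_regions(mask):
    """3x3 closing: dilation, then erosion (done as dilation of the complement)."""
    inverted = [[1 - v for v in row] for row in dilate(mask)]
    return [[1 - v for v in row] for row in dilate(inverted)]

def component_features(mask):
    """Label 8-connected objects and compute simple shape statistics for each."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    features = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue, ys, xs = deque([(y, x)]), [], []
            while queue:
                cy, cx = queue.popleft()
                ys.append(cy)
                xs.append(cx)
                for ny in range(max(0, cy - 1), min(h, cy + 2)):
                    for nx in range(max(0, cx - 1), min(w, cx + 2)):
                        if mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            bw, bh = max(xs) - min(xs) + 1, max(ys) - min(ys) + 1
            features.append({
                "bbox": (min(xs), min(ys), bw, bh),   # x, y, width, height
                "aspect_ratio": bw / bh,
                "fill_ratio": len(ys) / (bw * bh),    # object pixels / box area
            })
    return features

def isolate_objects(gray):
    """Run the three stages on a grayscale image given as a list of rows."""
    return component_features(close_regions(suppress_background(gray)))
```

Each dictionary returned by `isolate_objects` would be one candidate object; in the scheme described above, feature vectors of this kind are what a trained classifier would label as text or non-text.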

Acknowledgements

The authors are thankful to the Department of Computer Science and Engineering of Aliah University for providing every support for carrying out this work. The first author is also thankful to Aliah University for providing a research fellowship.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Aliah University, Kolkata, 700156, India
Tauseef Khan & Ayatullah Faruk Mollah

Corresponding author

Correspondence to Tauseef Khan.

Editor information

Editors and Affiliations

Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata, India
Bidyut B. Chaudhuri
School of Computing, National University of Singapore, Singapore, Singapore
Mohan S. Kankanhalli
Department of Computer Science and Engineering, Indian Institute of Technology Roorkee, Roorkee, Uttarakhand, India
Balasubramanian Raman

About this paper

Cite this paper

Khan, T., Mollah, A.F. (2018). A Novel Text Localization Scheme for Camera Captured Document Images. In: Chaudhuri, B., Kankanhalli, M., Raman, B. (eds) Proceedings of 2nd International Conference on Computer Vision & Image Processing. Advances in Intelligent Systems and Computing, vol 703. Springer, Singapore. https://doi.org/10.1007/978-981-10-7895-8_20

DOI: https://doi.org/10.1007/978-981-10-7895-8_20
Published: 12 April 2018
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-7894-1
Online ISBN: 978-981-10-7895-8
eBook Packages: Engineering, Engineering (R0)
