A Novel Text Localization Scheme for Camera Captured Document Images

  • Conference paper
Proceedings of 2nd International Conference on Computer Vision & Image Processing

Part of the book series: Advances in Intelligent Systems and Computing ((AISC,volume 703))

Abstract

In this paper, a hybrid model for detecting text regions in scene images as well as document images is presented. First, the background is suppressed to isolate foreground regions. Then, morphological operations are applied to the isolated foreground regions to obtain appropriate boundaries for these objects. Statistical features are extracted from the objects to classify them as text or non-text using a multi-layer perceptron. Components classified as text are localized, and non-text components are ignored. In experiments on a data set of 227 camera-captured images, the object isolation accuracy is 0.8638 and the text/non-text classification accuracy is 0.9648. For images with a near-homogeneous background, the present method yields reasonably satisfactory accuracy for practical applications.
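
The stages named in the abstract (background suppression, morphological clean-up, feature extraction, classification, localization) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the suppression method, the structuring element, the statistical features, or the perceptron configuration. The sketch assumes dark text on a lighter background, uses Otsu thresholding and a 3x3 closing as stand-ins, computes two simple shape features, and takes a hypothetical `is_text` callable in place of the trained multi-layer perceptron.

```python
from collections import deque

def otsu_threshold(gray):
    """Gray level (0-255) that maximizes between-class variance (assumed stand-in)."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var, w0, sum0 = 0, -1.0, 0, 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        var = w0 * w1 * (sum0 / w0 - (sum_all - sum0) / w1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t

def _morph(mask, rule):
    """Apply `rule` (any = dilation, all = erosion) over a 3x3 window clipped at borders."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [mask[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = 1 if rule(window) else 0
    return out

def closing(mask):
    """Morphological closing: dilation followed by erosion."""
    return _morph(_morph(mask, any), all)

def components(mask):
    """8-connected foreground components, each a list of (row, col) pixels."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    found = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue, pixels = deque([(y, x)]), []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny in range(max(0, cy - 1), min(h, cy + 2)):
                    for nx in range(max(0, cx - 1), min(w, cx + 2)):
                        if mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            found.append(pixels)
    return found

def features(pixels):
    """Illustrative shape features only; the paper's statistical features are not given here."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    bw, bh = max(xs) - min(xs) + 1, max(ys) - min(ys) + 1
    return {"box": (min(xs), min(ys), bw, bh),   # (x, y, width, height)
            "aspect": bw / bh,
            "fill": len(pixels) / (bw * bh)}

def localize(gray, is_text):
    """Return bounding boxes of components that `is_text` accepts."""
    t = otsu_threshold(gray)
    mask = [[1 if v <= t else 0 for v in row] for row in gray]  # dark pixels = foreground
    feats = [features(c) for c in components(closing(mask))]
    return [f["box"] for f in feats if is_text(f)]
```

Here `gray` is a list of rows of integer gray levels, and `is_text` receives the feature dictionary of one component. In practice a trained classifier's prediction would be passed in; a fixed rule such as `lambda f: f["aspect"] > 1.5` only serves to exercise the pipeline.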

Acknowledgements

The authors are thankful to the Department of Computer Science and Engineering of Aliah University for providing the support needed to carry out this work. The first author is also thankful to Aliah University for providing a research fellowship.

Author information

Corresponding author

Correspondence to Tauseef Khan.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Khan, T., Mollah, A.F. (2018). A Novel Text Localization Scheme for Camera Captured Document Images. In: Chaudhuri, B., Kankanhalli, M., Raman, B. (eds) Proceedings of 2nd International Conference on Computer Vision & Image Processing. Advances in Intelligent Systems and Computing, vol 703. Springer, Singapore. https://doi.org/10.1007/978-981-10-7895-8_20

  • DOI: https://doi.org/10.1007/978-981-10-7895-8_20

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-7894-1

  • Online ISBN: 978-981-10-7895-8

  • eBook Packages: Engineering, Engineering (R0)
