Statistical modeling for the detection, localization and extraction of text from heterogeneous textual images using combined feature scheme

  • Original Paper
Signal, Image and Video Processing

Abstract

Discriminating between the text and non-text regions of an image is a complex and challenging task. In contrast to Caption text, Scene text can have any orientation and may be distorted by perspective projection. It is also often affected by variations in scene and camera parameters such as illumination and focus. These variations make it extremely difficult to design a unified method for extracting text from different kinds of images. This paper proposes a unified statistical approach for extracting text from hybrid textual images (images containing both Scene text and Caption text) and from Document images with variations in text, using features selected with the multi-level feature priority (MLFP) algorithm. In combination, the selected features are found to form a good feature vector that discriminates between text and non-text regions in Scene text, Caption text and Document images, and the proposed system is robust to illumination, transformation/perspective projection, font size and radially changing/angular text. The MLFP feature selection algorithm is evaluated with three common machine learning algorithms: a decision tree inducer (C4.5), a naive Bayes classifier and an instance-based K-nearest neighbour learner. Its effectiveness is shown by comparison with three other feature selection methods on a benchmark dataset. The proposed text extraction system is compared with edge-based, connected-component-based and texture-based methods and shows encouraging results. Its major applications include preprocessing for optical character recognition, multimedia processing, mobile robot navigation, vehicle license detection and recognition, page segmentation and text-based image indexing.
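As a rough illustration of how a candidate feature subset can be scored with an instance-based K-nearest neighbour learner, the sketch below computes leave-one-out accuracy for different subsets of feature indices. It is not the MLFP algorithm, which the abstract does not specify. The function names (`knn_predict`, `loo_accuracy`), the toy samples and the choice of k = 3 are assumptions made purely for illustration.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def knn_predict(train, query, k=3):
    """Return the majority label among the k training samples nearest to query."""
    nearest = sorted(train, key=lambda s: euclidean(s[0], query))[:k]
    labels = [label for _, label in nearest]
    # With an odd k and two classes there are no ties to break.
    return max(set(labels), key=labels.count)

def loo_accuracy(samples, feature_idx, k=3):
    """Leave-one-out accuracy of k-NN using only the features in feature_idx."""
    projected = [([x[i] for i in feature_idx], y) for x, y in samples]
    hits = 0
    for i, (x, y) in enumerate(projected):
        rest = projected[:i] + projected[i + 1:]  # hold out sample i
        if knn_predict(rest, x, k) == y:
            hits += 1
    return hits / len(projected)

# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
SAMPLES = [
    ([0.90, 0.2], "text"), ([0.80, 0.7], "text"),
    ([0.85, 0.5], "text"), ([0.95, 0.1], "text"),
    ([0.10, 0.3], "non-text"), ([0.20, 0.8], "non-text"),
    ([0.15, 0.6], "non-text"), ([0.05, 0.4], "non-text"),
]

if __name__ == "__main__":
    for subset in ([0], [1], [0, 1]):
        print(subset, loo_accuracy(SAMPLES, subset))
```

A wrapper-style comparison of feature selection methods would repeat this scoring for each method's selected subset and for each classifier. Only the K-nearest neighbour case is sketched here.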

Abbreviations

  • NSCT: Non sub sampled Contourlet Transform
  • NSP: Non sub sampled pyramid
  • NSDFB: Non sub sampled directional filter bank
  • CC: Connected component
  • NLV: Normalized local variance
  • GLRLM: Gray level run length matrix
  • GLCM: Gray level co-occurrence matrix
  • MLFP: Multi level feature priority
  • VM Closing: Vertical morphological closing
  • HM Closing: Horizontal morphological closing
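To make one of the texture descriptors listed above concrete, the following minimal sketch builds a gray level co-occurrence matrix (GLCM) for a single pixel offset and derives the standard contrast statistic from it. It illustrates the general definition only, not the feature pipeline of the paper. The function names, the non-symmetric counting, the default offset and the 4 × 4 example image are all assumptions chosen for illustration.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count how often gray level i has gray level j at offset (dx, dy)."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            # Skip pairs whose second pixel falls outside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix

def contrast(matrix):
    """Contrast: sum over (i, j) of p(i, j) * (i - j)^2, with p the normalized GLCM."""
    total = sum(sum(row) for row in matrix)  # assumes at least one counted pair
    weighted = sum(
        count * (i - j) ** 2
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )
    return weighted / total

# Hypothetical 4 x 4 image with four gray levels (0..3).
IMAGE = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

if __name__ == "__main__":
    m = glcm(IMAGE, levels=4)  # horizontal neighbour, one pixel to the right
    for row in m:
        print(row)
    print("contrast:", contrast(m))
```

Regions with many sharp gray level transitions, such as character strokes against a background, tend to give higher contrast than smooth regions, which is one reason co-occurrence statistics are used as texture features.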

Author information

Corresponding author

Correspondence to Chitrakala Gopalan.

About this article

Cite this article

Gopalan, C., Manjula, D. Statistical modeling for the detection, localization and extraction of text from heterogeneous textual images using combined feature scheme. SIViP 5, 165–183 (2011). https://doi.org/10.1007/s11760-010-0152-1

  • DOI: https://doi.org/10.1007/s11760-010-0152-1
