Abstract
Discriminating between the text and non-text regions of an image is a complex and challenging task. In contrast to caption text, scene text can have any orientation and may be distorted by perspective projection. Moreover, it is often affected by variations in scene and camera parameters such as illumination and focus. These variations make it extremely difficult to design a unified text extraction method that works across different kinds of images. This paper proposes a unified statistical approach for extracting text from hybrid textual images (images containing both scene text and caption text) and from document images with variations in text. The approach uses features carefully selected with the multi level feature priority (MLFP) algorithm. In combination, the selected features are found to be a good choice of feature vector: they discriminate between text and non-text regions in scene text, caption text and document images, and the proposed system is robust to illumination, transformation/perspective projection, font size and radially changing/angular text. The MLFP feature selection algorithm is evaluated with three common machine learning algorithms: a decision tree inducer (C4.5), a naive Bayes classifier, and an instance-based K-nearest neighbour learner. Its effectiveness is shown by comparison with three other feature selection methods on a benchmark dataset. The proposed text extraction system is compared with an edge-based method, a connected component method and a texture-based method, and shows encouraging results. Its major applications include preprocessing for optical character recognition, multimedia processing, mobile robot navigation, vehicle license detection and recognition, page segmentation and text-based image indexing.
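The evaluation idea mentioned above, scoring a candidate feature subset by how well a classifier performs when restricted to it, can be illustrated with a minimal sketch. This is not the MLFP algorithm, whose details the abstract does not give. It assumes a plain k-nearest-neighbour learner with leave-one-out accuracy as the score, and the function names (`knn_predict`, `loo_accuracy`) and the toy data are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    nearest = sorted(
        range(len(train_X)), key=lambda i: math.dist(train_X[i], x)
    )[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


def loo_accuracy(X, y, feature_idx, k=3):
    """Leave-one-out accuracy of k-NN using only the columns in feature_idx."""
    # Project every sample onto the candidate feature subset.
    proj = [[row[j] for j in feature_idx] for row in X]
    correct = 0
    for i in range(len(proj)):
        # Hold out sample i and train on all the others.
        train_X = proj[:i] + proj[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if knn_predict(train_X, train_y, proj[i], k) == y[i]:
            correct += 1
    return correct / len(proj)


# Hypothetical toy data: column 0 separates the classes, column 1 is noise.
X = [[0.9, 0.2], [0.8, 0.7], [0.85, 0.4], [0.95, 0.9],
     [0.1, 0.3], [0.2, 0.8], [0.15, 0.5], [0.05, 0.1]]
y = ["text"] * 4 + ["non-text"] * 4

acc_informative = loo_accuracy(X, y, [0])  # subset containing the useful column
acc_noise = loo_accuracy(X, y, [1])        # subset containing only the noise column
```

On this toy data the subset holding the informative column classifies every held-out sample correctly, while the noise-only subset does not. A feature selection method would be judged by whether the subset it picks scores well under several such learners.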
Abbreviations
- NSCT: Non sub sampled Contourlet Transform
- NSP: Non sub sampled pyramid
- NSDFB: Non sub sampled directional filter bank
- CC: Connected component
- NLV: Normalized local variance
- GLRLM: Gray level run length matrix
- GLCM: Gray level co-occurrence matrix
- MLFP: Multi level feature priority
- VM Closing: Vertical morphological closing
- HM Closing: Horizontal morphological closing
Cite this article
Chitrakala Gopalan, Manjula, D. Statistical modeling for the detection, localization and extraction of text from heterogeneous textual images using combined feature scheme. SIViP 5, 165–183 (2011). https://doi.org/10.1007/s11760-010-0152-1
DOI: https://doi.org/10.1007/s11760-010-0152-1