Scene text extraction based on edges and support vector regression

Lu, Shijian; Chen, Tao; Tian, Shangxuan; Lim, Joo-Hwee; Tan, Chew-Lim

doi:10.1007/s10032-015-0237-z

Special Issue Paper
Published: 08 February 2015

Volume 18, pages 125–135, (2015)

International Journal on Document Analysis and Recognition (IJDAR)

Shijian Lu¹,
Tao Chen¹,
Shangxuan Tian²,
Joo-Hwee Lim¹ &
Chew-Lim Tan²

Abstract

This paper presents a scene text extraction technique that automatically detects and segments text in scene images. Three text-specific features are designed over image edges and used to detect a set of candidate text boundaries. For each candidate text boundary, one or more candidate characters are then extracted using a local threshold estimated from the surrounding image pixels. Finally, the true characters and words are identified by a support vector regression model trained on a bag-of-words representation. The proposed technique has been evaluated on the ICDAR-2013 Robust Reading Competition dataset. Experiments show that it obtains superior F-measures of 78.19% for scene text detection and 75.24% (at the atom level) for scene text segmentation.
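
The character-extraction step described above can be illustrated with a minimal sketch. The abstract does not say how the local threshold is estimated, so this example assumes the simplest possible estimator: the mean intensity of the pixels in a window around the candidate boundary. The names `local_threshold` and `extract_candidate`, the `margin` parameter, and the `dark_text` flag are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale rows, values 0..255
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive


def local_threshold(image: Image, box: Box, margin: int = 2) -> float:
    """Mean intensity of the pixels in `box` grown by `margin` on every side.

    ASSUMPTION: the mean is a stand-in estimator; the paper's actual
    estimator is not given in the abstract.
    """
    top, left, bottom, right = box
    height, width = len(image), len(image[0])
    # Clip the grown window to the image borders.
    r0, r1 = max(0, top - margin), min(height, bottom + margin)
    c0, c1 = max(0, left - margin), min(width, right + margin)
    window = [image[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    return sum(window) / len(window)


def extract_candidate(image: Image, box: Box, margin: int = 2,
                      dark_text: bool = True) -> List[List[int]]:
    """Binary mask (1 = candidate character pixel) for the pixels inside `box`."""
    threshold = local_threshold(image, box, margin)
    top, left, bottom, right = box
    mask = []
    for r in range(top, bottom):
        row = []
        for c in range(left, right):
            pixel = image[r][c]
            is_text = pixel < threshold if dark_text else pixel > threshold
            row.append(1 if is_text else 0)
        mask.append(row)
    return mask


# Toy example: a dark "H" on a bright background.
image = [
    [200, 200, 200, 200, 200],
    [200,  40, 200,  40, 200],
    [200,  40,  40,  40, 200],
    [200,  40, 200,  40, 200],
    [200, 200, 200, 200, 200],
]
box = (1, 1, 4, 4)
mask = extract_candidate(image, box, margin=1)
# mask == [[1, 0, 1], [1, 1, 1], [1, 0, 1]]
```

Estimating the threshold per candidate boundary, rather than once for the whole image, lets each character be separated from its own local background, which matters in scene images where lighting and background vary from region to region.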

Notes

http://dag.cvc.uab.es/icdar2013competition/.

Author information

Authors and Affiliations

Institute for Infocomm Research, A*STAR, 1 Fusionopolis Way, #21-01 Connexis, Singapore, 138632, Singapore
Shijian Lu, Tao Chen & Joo-Hwee Lim
School of Computing, National University of Singapore, 21 Lower Kent Ridge Road, Singapore, 119077, Singapore
Shangxuan Tian & Chew-Lim Tan

Corresponding author

Correspondence to Shijian Lu.

About this article

Cite this article

Lu, S., Chen, T., Tian, S. et al. Scene text extraction based on edges and support vector regression. IJDAR 18, 125–135 (2015). https://doi.org/10.1007/s10032-015-0237-z

Received: 31 May 2014
Revised: 22 December 2014
Accepted: 07 January 2015
Published: 08 February 2015
Issue Date: June 2015
DOI: https://doi.org/10.1007/s10032-015-0237-z
