Scene text detection using adaptive color reduction, adjacent character model and hybrid verification strategy

Wu, Hui; Zou, Beiji; Zhao, Yu-qian; Guo, Jianjing

doi:10.1007/s00371-015-1156-1

Original Article
Published: 23 September 2015

The Visual Computer
Volume 33, pages 113–126 (2017)

Hui Wu^1,2, Beiji Zou^1,2, Yu-qian Zhao^1,2,3 & Jianjing Guo^1,2

Abstract

Text detection is a prerequisite for text recognition and understanding, and it supports many image analysis applications. In this paper, we propose an effective scene text detection method with three major steps: connected component (CC) extraction, character linking, and text/non-text classification. First, for CC extraction, we design an adaptive color reduction scheme that analyzes the image color histogram, selects color centers accordingly, and generates a variable number of color layers depending on the color complexity of each image. Then, for character linking, we build an adjacent character model by training an extreme learning machine (ELM), rather than setting the various thresholds used in previous approaches. Finally, we adopt a hybrid text verification strategy that combines a convolutional neural network with an ELM for text/non-text classification and performs better than either classifier alone. Experimental results on several publicly available datasets illustrate the effectiveness of our method, and comparisons with state-of-the-art algorithms show that it is competitive.
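To make the first step more concrete, the following is a minimal sketch of histogram-driven color reduction. It is not the authors' scheme: the abstract does not state how color centers are selected, so the rule used here (a bin becomes a center if it is a local maximum of the histogram and holds a minimum share of the pixels) is an assumption, as are the function names `select_color_centers` and `split_into_layers` and the parameters `bins` and `min_share`. For brevity it works on a single intensity channel rather than full color.

```python
def select_color_centers(values, bins=16, min_share=0.05, max_value=255):
    """Pick histogram peaks as centers (hypothetical rule, not the paper's)."""
    width = (max_value + 1) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int(v / width), bins - 1)] += 1

    total = len(values)
    centers = []
    for i, count in enumerate(hist):
        left = hist[i - 1] if i > 0 else 0
        right = hist[i + 1] if i < bins - 1 else 0
        # Keep a bin if it holds enough pixels and is a local maximum.
        # On a plateau of equal bins, only the last bin is kept.
        if count >= min_share * total and count >= left and count > right:
            centers.append((i + 0.5) * width)  # bin midpoint
    return centers


def split_into_layers(values, centers):
    """Assign each pixel index to its nearest center: one layer per center."""
    if not centers:
        return {}
    layers = {c: [] for c in centers}
    for idx, v in enumerate(values):
        nearest = min(centers, key=lambda c: abs(c - v))
        layers[nearest].append(idx)
    return layers


if __name__ == "__main__":
    # Flattened toy "images": the second has more distinct intensity clusters.
    simple_image = [10] * 50 + [200] * 50
    complex_image = [10] * 25 + [90] * 25 + [170] * 25 + [250] * 25

    for name, image in (("simple", simple_image), ("complex", complex_image)):
        centers = select_color_centers(image)
        layers = split_into_layers(image, centers)
        print(name, "centers:", centers, "layers:", len(layers))
```

The point of the sketch is only that the number of layers is not fixed in advance: the two-cluster toy image yields two layers and the four-cluster one yields four. Connected components would then be extracted from each layer separately.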

Acknowledgments

The authors would like to thank the anonymous reviewers for their valuable suggestions.

Author information

Authors and Affiliations

School of Information Science and Engineering, Central South University, Changsha, 410083, China
Hui Wu, Beiji Zou, Yu-qian Zhao & Jianjing Guo
Mobile Health Ministry of Education, China Mobile Joint Laboratory, Changsha, 410012, Hunan, China
Hui Wu, Beiji Zou, Yu-qian Zhao & Jianjing Guo
School of Geosciences and Info-Physics, Central South University, Changsha, 410083, China
Yu-qian Zhao

Corresponding authors

Correspondence to Beiji Zou or Yu-qian Zhao.

Additional information

This work was partly supported by the National Natural Science Foundation of China (Grant Nos. 61172184, 61379107, 61402539, and 61573380), the Program for New Century Excellent Talents in University of the Ministry of Education of China (Grant No. NCET-13-0603), the Specialized Research Fund for the Doctoral Program of Higher Education (Grant No. 20130162110016), and the Fundamental Research Funds for the Central Universities of Central South University (Grant No. 2015zzts052).

About this article

Cite this article

Wu, H., Zou, B., Zhao, Yq. et al. Scene text detection using adaptive color reduction, adjacent character model and hybrid verification strategy. Vis Comput 33, 113–126 (2017). https://doi.org/10.1007/s00371-015-1156-1

Published: 23 September 2015
Issue Date: January 2017
DOI: https://doi.org/10.1007/s00371-015-1156-1
