A hierarchical recursive method for text detection in natural scene images

Wang, Xiaobing; Song, Yonghong; Zhang, Yuanlin; Xin, Jingmin

doi:10.1007/s11042-016-4099-2

Published: 12 December 2016

Multimedia Tools and Applications, Volume 76, pages 26201–26223 (2017)

Xiaobing Wang¹, Yonghong Song¹, Yuanlin Zhang¹ & Jingmin Xin¹

Abstract

Text detection in natural scene images is a challenging problem in computer vision. To robustly detect varied text in complex scenes, this paper proposes a hierarchical recursive text detection method. Text in natural scenes usually does not appear in isolation but is arranged into lines for easy reading. To find all possible text lines in an image, candidate text lines are first obtained using text edge boxes and a convolutional neural network. Then, to accurately identify the true text lines, these candidates are analyzed in a hierarchical recursive architecture: for each candidate, connected component segmentation and hierarchical random field based analysis are applied recursively until the detected text line no longer changes. The resulting text lines are output as the text detection result. Experiments on the ICDAR 2003 dataset, the ICDAR 2013 dataset and the Street View Dataset show that the hierarchical recursive architecture improves text detection performance and that the proposed method achieves state-of-the-art results in scene text detection.
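
The recursive step described in the abstract (re-analyzing each candidate text line until it stops changing) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `refine_until_stable`, `detect_text_lines` and `make_toy_refine`, the box representation, and the `max_rounds` cap are all assumptions made for illustration, and the toy refinement step merely stands in for the connected component segmentation and hierarchical random field analysis, whose details are not given here.

```python
from typing import Callable, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), an assumed representation
RefineStep = Callable[[Box], Optional[Box]]


def refine_until_stable(line: Box, refine: RefineStep, max_rounds: int = 20) -> Optional[Box]:
    """Re-apply `refine` until the text line stops changing (a fixed point)."""
    for _ in range(max_rounds):
        updated = refine(line)
        if updated is None:   # the analysis rejected the candidate as non-text
            return None
        if updated == line:   # no change since the last round: stop recursing
            return line
        line = updated
    return line               # safety cap reached (assumed); return the latest estimate


def detect_text_lines(candidates: List[Box], refine: RefineStep) -> List[Box]:
    """Refine every candidate and keep the distinct surviving text lines."""
    detected: List[Box] = []
    for candidate in candidates:
        final = refine_until_stable(candidate, refine)
        if final is not None and final not in detected:
            detected.append(final)
    return detected


def overlaps(a: Box, b: Box) -> bool:
    """True if the two axis-aligned boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def make_toy_refine(components: List[Box]) -> RefineStep:
    """Hypothetical stand-in for segmentation plus random-field labelling.

    It snaps a line to the bounding box of the components it touches,
    and rejects the line if it touches none.
    """
    def refine(line: Box) -> Optional[Box]:
        touched = [c for c in components if overlaps(c, line)]
        if not touched:
            return None
        return (min(c[0] for c in touched), min(c[1] for c in touched),
                max(c[2] for c in touched), max(c[3] for c in touched))
    return refine


# Toy data: two character-like components on one line, one far away.
components = [(10, 10, 20, 20), (22, 10, 30, 20), (100, 100, 110, 110)]
candidates = [(5, 5, 35, 25), (8, 8, 25, 22), (50, 50, 60, 60)]
lines = detect_text_lines(candidates, make_toy_refine(components))
print(lines)
```

In this toy run the first two candidates converge to the same line and the third is rejected, which mirrors the idea of stopping once a detected line no longer changes. A real refinement step would replace `make_toy_refine` entirely.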

Acknowledgments

This work is supported by the National Natural Science Foundation of China (91520301).

Author information

Authors and Affiliations

Institute of Artificial Intelligence and Robotics, Xi’an Jiaotong University, Xi’an, 710049, China
Xiaobing Wang, Yonghong Song, Yuanlin Zhang & Jingmin Xin

Corresponding author

Correspondence to Yonghong Song.

About this article

Cite this article

Wang, X., Song, Y., Zhang, Y. et al. A hierarchical recursive method for text detection in natural scene images. Multimed Tools Appl 76, 26201–26223 (2017). https://doi.org/10.1007/s11042-016-4099-2

Received: 06 April 2016
Revised: 26 August 2016
Accepted: 27 October 2016
Published: 12 December 2016
Issue Date: December 2017
DOI: https://doi.org/10.1007/s11042-016-4099-2
