A Hierarchical Visual Saliency Model for Character Detection in Natural Scenes

Gao, Renwu; Shafait, Faisal; Uchida, Seiichi; Feng, Yaokai

doi:10.1007/978-3-319-05167-3_2

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8357)

Included in the following conference series:

International Workshop on Camera-Based Document Analysis and Recognition

Abstract

Visual saliency models have been introduced to the field of character recognition for detecting characters in natural scenes. Researchers believe that characters have different visual properties from their non-character neighbors, which makes them salient. Under this assumption, characters should respond well to computational models of visual saliency. However, in some situations, characters belonging to scene text might not be as salient as one might expect. For instance, a signboard is usually very salient, but the characters on the signboard might not necessarily be so salient globally. To analyze this hypothesis in more depth, we first examine how much background regions such as signboards affect the task of saliency-based character detection in natural scenes. We then propose a hierarchical-saliency method for detecting characters in natural scenes. Experiments on a dataset of over 3,000 images containing scene text show that, when using saliency alone for scene text detection, our proposed hierarchical method captures a larger percentage of text pixels than the conventional single-pass algorithm.
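
The abstract does not spell out the hierarchical procedure, so the following is only a minimal sketch of one plausible reading: compute saliency once over the whole image, keep the most salient region, then recompute saliency inside that region alone. Everything here is an assumption made for illustration and not the authors' implementation: the stand-in `toy_saliency` (deviation from the region mean rather than a full saliency model), the use of Otsu's threshold to binarize each pass, the bounding-box crop, and the synthetic scene from `make_toy_scene`.

```python
import numpy as np


def toy_saliency(region):
    """Stand-in saliency (assumption): normalized deviation from the region mean."""
    diff = np.abs(region - region.mean())
    peak = diff.max()
    return diff / peak if peak > 0 else np.zeros_like(diff)


def otsu_threshold(values, bins=64):
    """Otsu's threshold for values in [0, 1], computed from a histogram."""
    hist, edges = np.histogram(values.ravel(), bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    centers = 0.5 * (edges[:-1] + edges[1:])
    w0 = np.cumsum(p)            # probability mass at or below each bin
    w1 = 1.0 - w0                # probability mass above each bin
    m = np.cumsum(p * centers)   # cumulative first moment
    between = np.zeros(bins)     # between-class variance per candidate split
    valid = (w0 > 0) & (w1 > 0)
    between[valid] = (m[-1] * w0[valid] - m[valid]) ** 2 / (w0[valid] * w1[valid])
    return edges[int(np.argmax(between)) + 1]


def single_pass_mask(img):
    """One saliency pass over the given image, binarized with Otsu's threshold."""
    sal = toy_saliency(img)
    return sal > otsu_threshold(sal)


def hierarchical_mask(img):
    """Two passes: find the salient region, then recompute saliency inside it."""
    coarse = single_pass_mask(img)
    out = np.zeros(img.shape, dtype=bool)
    if not coarse.any():
        return out
    rows = np.flatnonzero(coarse.any(axis=1))
    cols = np.flatnonzero(coarse.any(axis=0))
    r0, r1, c0, c1 = rows[0], rows[-1] + 1, cols[0], cols[-1] + 1
    # Second pass sees only the cropped region, so saliency becomes local.
    out[r0:r1, c0:c1] = single_pass_mask(img[r0:r1, c0:c1])
    return out


def text_pixel_recall(mask, text):
    """Fraction of ground-truth text pixels covered by the mask."""
    return (mask & text).sum() / text.sum()


def make_toy_scene():
    """Hypothetical scene: dark background, bright signboard, mid-gray characters."""
    img = np.full((40, 40), 0.1)
    img[10:30, 10:30] = 0.9          # bright "signboard"
    text = np.zeros((40, 40), dtype=bool)
    text[18:22, 14:26] = True        # "character" pixels on the signboard
    img[text] = 0.6
    return img, text


img, text = make_toy_scene()
print("single-pass recall: ", text_pixel_recall(single_pass_mask(img), text))
print("hierarchical recall:", text_pixel_recall(hierarchical_mask(img), text))
```

On this synthetic scene the first pass marks the signboard as salient and thresholds the characters away, while the second pass, restricted to the signboard, separates the characters from it. That mirrors the signboard example in the abstract but says nothing about performance on real images.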

Notes

1. We are planning to make the database freely available in the near future.

Author information

Authors and Affiliations

Information Science and Electrical Engineering, Kyushu University, Fukuoka, Japan
Renwu Gao
The University of Western Australia, Perth, Australia
Faisal Shafait
Kyushu University, Fukuoka, Japan
Seiichi Uchida & Yaokai Feng

Corresponding author

Correspondence to Renwu Gao.

Editor information

Editors and Affiliations

Graduate School of Engineering, Osaka Prefecture University, Osaka, Japan
Masakazu Iwamura
The University of Western Australia, Crawley, West Australia, Australia
Faisal Shafait

About this paper

Cite this paper

Gao, R., Shafait, F., Uchida, S., Feng, Y. (2014). A Hierarchical Visual Saliency Model for Character Detection in Natural Scenes. In: Iwamura, M., Shafait, F. (eds) Camera-Based Document Analysis and Recognition. CBDAR 2013. Lecture Notes in Computer Science, vol 8357. Springer, Cham. https://doi.org/10.1007/978-3-319-05167-3_2

DOI: https://doi.org/10.1007/978-3-319-05167-3_2
Published: 19 March 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-05166-6
Online ISBN: 978-3-319-05167-3
eBook Packages: Computer Science, Computer Science (R0)
