A New Text Detection Algorithm in Images/Video Frames

Ye, Qixiang; Huang, Qingming

doi:10.1007/978-3-540-30542-2_106

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3332)

Included in the following conference series:

Pacific-Rim Conference on Multimedia

Abstract

In this paper, we propose a new text detection algorithm for images and video frames, built on a coarse-to-fine framework. In the coarse detection stage, a multiscale wavelet energy feature is used to locate all possible text pixels, and a density-based region growing method then connects these pixels into candidate text lines. In the fine detection stage, four kinds of texture features are combined to represent each candidate text line, and an SVM classifier separates true text from the false candidates. Experimental results on two datasets show encouraging performance for the proposed algorithm.
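The coarse detection stage can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the single-level Haar decomposition (the paper uses a multiscale feature), the window radius `r`, the thresholds, and every function name below are assumptions made for illustration only.

```python
from collections import deque


def haar_energy(img):
    """Detail energy of a single-level 2D Haar transform, one value per 2x2 block.

    Assumption: a single scale is used here for brevity.
    """
    h, w = len(img) // 2, len(img[0]) // 2
    energy = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            a, b = img[2 * i][2 * j], img[2 * i][2 * j + 1]
            c, d = img[2 * i + 1][2 * j], img[2 * i + 1][2 * j + 1]
            row_diff = (a + b - c - d) / 2.0   # top row vs. bottom row
            col_diff = (a - b + c - d) / 2.0   # left column vs. right column
            diag_diff = (a - b - c + d) / 2.0  # diagonal detail
            energy[i][j] = row_diff ** 2 + col_diff ** 2 + diag_diff ** 2
    return energy


def candidate_mask(energy, threshold):
    """Mark positions whose detail energy exceeds a (hypothetical) threshold."""
    return [[e > threshold for e in row] for row in energy]


def density(mask, i, j, r):
    """Fraction of candidate positions in the window of radius r around (i, j)."""
    h, w = len(mask), len(mask[0])
    cells = [(y, x)
             for y in range(max(0, i - r), min(h, i + r + 1))
             for x in range(max(0, j - r), min(w, j + r + 1))]
    return sum(mask[y][x] for y, x in cells) / len(cells)


def grow_regions(mask, r=1, min_density=0.4):
    """Group dense candidate positions into boxes (top, left, bottom, right).

    Assumption: a position is kept only if its local candidate density is at
    least min_density; kept positions are merged by 8-connected growing.
    """
    h, w = len(mask), len(mask[0])
    dense = [[mask[i][j] and density(mask, i, j, r) >= min_density
              for j in range(w)] for i in range(h)]
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for i in range(h):
        for j in range(w):
            if not dense[i][j] or seen[i][j]:
                continue
            seen[i][j] = True
            queue = deque([(i, j)])
            top, left, bottom, right = i, j, i, j
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny in range(max(0, y - 1), min(h, y + 2)):
                    for nx in range(max(0, x - 1), min(w, x + 2)):
                        if dense[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Toy frame: a flat background, a textured (checkerboard) stripe standing in
# for a text line, and one isolated bright pixel standing in for noise.
img = [[0] * 16 for _ in range(8)]
for y in range(2, 6):
    for x in range(4, 12):
        img[y][x] = 255 * ((x + y) % 2)
img[0][15] = 255

energy = haar_energy(img)
boxes = grow_regions(candidate_mask(energy, threshold=1000.0))
```

In this toy frame the isolated pixel passes the energy threshold but fails the density test, so only the textured stripe survives as a candidate region. The box coordinates are in the half-resolution energy map, not in original pixel units. In the framework the abstract describes, such candidates would then go to the fine detection stage for texture-feature extraction and SVM classification.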

Author information

Authors and Affiliations

Digital Media Lab, Institute of Computing Technology, and Research Center of Digital Media, Graduate School of the Chinese Academy of Sciences, Beijing, 100039, China
Qixiang Ye & Qingming Huang

Editor information

Editors and Affiliations

Department of Information and Communication Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, 113-8656, Tokyo, Japan
Kiyoharu Aizawa
Tokyo Research Laboratory, IBM Research, 1623-14 Shimo-tsuruma, 242-0001, Yamato, Kanagawa, Japan
Yuichi Nakamura
National Institute of Informatics, Tokyo, Japan
Shin’ichi Satoh

About this paper

Cite this paper

Ye, Q., Huang, Q. (2004). A New Text Detection Algorithm in Images/Video Frames. In: Aizawa, K., Nakamura, Y., Satoh, S. (eds) Advances in Multimedia Information Processing - PCM 2004. PCM 2004. Lecture Notes in Computer Science, vol 3332. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30542-2_106

DOI: https://doi.org/10.1007/978-3-540-30542-2_106
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23977-2
Online ISBN: 978-3-540-30542-2
eBook Packages: Computer Science, Computer Science (R0)
