Perspective Scene Text Recognition with Feature Compression and Ranking

Zhou, Yu; Liu, Shuang; Zhang, Yongzheng; Wang, Yipeng; Lin, Weiyao

doi:10.1007/978-3-319-16631-5_14

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9009)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

In this paper we propose a novel character representation for scene text recognition. To recognize each individual character, we first adopt a bag-of-words approach in which rotation-invariant circular Fourier-HOG features are densely extracted from the character and compressed into mid-level features. We then train a set of two-class linear Support Vector Machines in a one-vs-all scheme to rank the compressed features by their contribution to classification. Based on the ranking, we keep only the top-rated features to build a compact and discriminative codebook. Because the densely extracted features are rotation-invariant and efficient, our method can recognize perspective text of arbitrary orientation and can be combined with existing word recognition methods. Experimental results demonstrate that our method is highly efficient and achieves state-of-the-art performance on several benchmark datasets.
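
The ranking and selection step described above can be illustrated with a small sketch. It is not the authors' implementation: it assumes bag-of-words histograms are already available (Fourier-HOG extraction and feature compression are not shown), it scores each codeword by its squared weight summed over all one-vs-all classifiers, and it trains each linear SVM with plain subgradient descent. The scoring rule, the optimizer, the hyperparameters, the toy data, and every function name are assumptions made for illustration.

```python
# Sketch only: rank codebook entries by linear-SVM weights, keep the top ones.
# The scoring rule (sum of squared weights over all one-vs-all classifiers),
# the optimizer, and all names below are assumptions, not the paper's code.

def train_linear_svm(X, y, lam=0.01, lr=0.1, iters=500):
    """Full-batch subgradient descent on the L2-regularized hinge loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(iters):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # this sample violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def rank_codewords(histograms, labels):
    """Score each codeword by its squared weight summed over one-vs-all SVMs."""
    scores = [0.0] * len(histograms[0])
    for cls in sorted(set(labels)):
        y = [1.0 if lab == cls else -1.0 for lab in labels]
        w, _ = train_linear_svm(histograms, y)
        scores = [s + wj * wj for s, wj in zip(scores, w)]
    return scores

def select_top(scores, k):
    """Indices of the k highest-scoring codewords, in ascending index order."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(order[:k])

def project(histogram, keep):
    """Restrict a histogram to the selected codewords."""
    return [histogram[j] for j in keep]

# Toy bag-of-words histograms over a 4-entry codebook (made-up numbers).
# Codewords 0 and 1 separate the two classes; codewords 2 and 3 do not.
histograms = [
    [1.0, 0.0, 0.4, 0.6],
    [0.9, 0.1, 0.6, 0.4],
    [0.0, 1.0, 0.4, 0.6],
    [0.1, 0.9, 0.6, 0.4],
]
labels = ["A", "A", "B", "B"]

scores = rank_codewords(histograms, labels)
keep = select_top(scores, 2)
compact = [project(h, keep) for h in histograms]
```

On this toy input the two class-specific codewords receive by far the largest scores, so the compact representation keeps only those two histogram bins. A real system would more likely rely on an established SVM solver than on this hand-written loop.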

Y. Zhou and S. Liu contributed equally.

Acknowledgment

This paper is partially supported by the National Natural Science Foundation of China under contract nos. 61303170, 61402472 and 61471235, and by the National High Technology Research and Development Program of China (863 Program) under contract nos. 2013AA014703 and 2012AA012803.

Author information

Authors and Affiliations

Institute of Information Engineering, Chinese Academy of Sciences, Beijing, 100093, China
Yu Zhou, Shuang Liu, Yongzheng Zhang & Yipeng Wang
School of Electronic, Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China
Weiyao Lin

Corresponding author

Correspondence to Yongzheng Zhang.

Editor information

Editors and Affiliations

Center for Visual Information Technology, International Institute of Information Technology, Hyderabad, India
C.V. Jawahar
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Shiguang Shan

About this paper

Cite this paper

Zhou, Y., Liu, S., Zhang, Y., Wang, Y., Lin, W. (2015). Perspective Scene Text Recognition with Feature Compression and Ranking. In: Jawahar, C., Shan, S. (eds) Computer Vision - ACCV 2014 Workshops. ACCV 2014. Lecture Notes in Computer Science, vol 9009. Springer, Cham. https://doi.org/10.1007/978-3-319-16631-5_14

DOI: https://doi.org/10.1007/978-3-319-16631-5_14
Published: 11 April 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16630-8
Online ISBN: 978-3-319-16631-5
eBook Packages: Computer Science, Computer Science (R0)
