Feature Pooling in Scene Character Recognition: A Comprehensive Study

Zhang, Zhong; Wang, Hong; Liu, Shuang; Shao, Yunxue

doi:10.1007/978-981-10-6571-2_262

Conference paper
First Online: 07 June 2018

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 463)

Abstract

In this paper, we focus on feature pooling methods for scene character recognition. We examine three kinds of pooling methods: average (sum) pooling, max pooling, and weighted pooling. Specifically, we introduce various feature pooling methods, study their merits and demerits, and discuss open problems. Finally, we compare the methods on the ICDAR2003 and Chars74k databases.
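
To make the three pooling families named in the abstract concrete, here is a minimal sketch in plain Python. It assumes each character image has already been turned into N local descriptors, each coded over K codewords, so pooling reduces an N x K list to one K-dimensional vector. The function names, the toy numbers, and in particular the weighting scheme (one non-negative weight per descriptor, normalized by the weight sum) are illustrative assumptions and are not taken from the paper.

```python
from typing import List, Sequence

# N local descriptors, each coded over K codewords (assumed input layout).
Codes = Sequence[Sequence[float]]


def sum_pooling(codes: Codes) -> List[float]:
    """Add up each codeword's responses across all local descriptors."""
    return [sum(column) for column in zip(*codes)]


def average_pooling(codes: Codes) -> List[float]:
    """Sum pooling divided by the number of descriptors (codes must be non-empty)."""
    n = len(codes)
    return [total / n for total in sum_pooling(codes)]


def max_pooling(codes: Codes) -> List[float]:
    """Keep only the strongest response per codeword."""
    return [max(column) for column in zip(*codes)]


def weighted_pooling(codes: Codes, weights: Sequence[float]) -> List[float]:
    """Weighted average with one weight per descriptor (hypothetical scheme)."""
    if len(weights) != len(codes):
        raise ValueError("need exactly one weight per descriptor")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return [
        sum(w * value for w, value in zip(weights, column)) / total_weight
        for column in zip(*codes)
    ]


# Toy input: 3 descriptors coded over 2 codewords (made-up numbers).
codes = [[0.0, 1.0], [2.0, 1.0], [4.0, 4.0]]
weights = [1.0, 1.0, 2.0]
pooled = {
    "sum": sum_pooling(codes),
    "average": average_pooling(codes),
    "max": max_pooling(codes),
    "weighted": weighted_pooling(codes, weights),
}
```

On the toy input, average pooling spreads influence evenly over all descriptors, max pooling keeps only the largest response per codeword, and the weighted variant shifts the result toward the descriptor with the largest weight.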

Acknowledgements

This work is supported by the National Natural Science Foundation of China under Grant Nos. 61501327, 61711530240, and 61401309; the Natural Science Foundation of Tianjin under Grant Nos. 17JCZDJC30600 and 15JCQNJC01700; the Open Projects Program of the National Laboratory of Pattern Recognition under Grant No. 201700001; and the Doctoral Fund of Tianjin Normal University under Grant Nos. 5RL134 and 52XB1405.

Author information

Authors and Affiliations

Tianjin Key Laboratory of Wireless Mobile Communications and Power Transmission, Tianjin Normal University, Tianjin, China
Zhong Zhang, Hong Wang & Shuang Liu
College of Computer Science, Inner Mongolia University, Inner Mongolia, China
Yunxue Shao

Corresponding author

Correspondence to Zhong Zhang.

Editor information

Editors and Affiliations

Department of Electrical Engineering, University of Texas at Arlington, Arlington, Texas, USA
Qilian Liang
College of Electronic and Communication Engineering, Tianjin Normal University, Tianjin, China
Jiasong Mu
Harbin Institute of Technology, Harbin, Heilongjiang, China
Min Jia
College of Electronic and Communication Engineering, Tianjin Normal University, Tianjin, China
Wei Wang
College of Electronic and Communication Engineering, Tianjin Normal University, Tianjin, China
Xuhong Feng
College of Electronic and Communication Engineering, Tianjin Normal University, Tianjin, China
Baoju Zhang

Cite this paper

Zhang, Z., Wang, H., Liu, S., Shao, Y. (2019). Feature Pooling in Scene Character Recognition: A Comprehensive Study. In: Liang, Q., Mu, J., Jia, M., Wang, W., Feng, X., Zhang, B. (eds) Communications, Signal Processing, and Systems. CSPS 2017. Lecture Notes in Electrical Engineering, vol 463. Springer, Singapore. https://doi.org/10.1007/978-981-10-6571-2_262

DOI: https://doi.org/10.1007/978-981-10-6571-2_262
Published: 07 June 2018
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6570-5
Online ISBN: 978-981-10-6571-2
eBook Packages: Engineering, Engineering (R0)
