Chinese Text Detection Using Deep Learning Model and Synthetic Data

Gao, Wei-wei; Zhang, Jun; Chen, Peng; Wang, Bing; Xia, Yi

doi:10.1007/978-3-319-95930-6_46

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10954)

Included in the following conference series:

International Conference on Intelligent Computing

Abstract

Detecting text in natural scene images is very challenging and remains an unsolved problem. In this work we propose a fast and reliable algorithm for generating synthetic images that contain Chinese characters. The proposed algorithm makes the text content cover the background in a natural way. To validate the effectiveness of the proposed method, a second dataset is generated with an ordinary fusion method. Both datasets are used to train a Faster R-CNN network. The experimental results show that the dataset generated by the proposed method achieves better detection performance than the dataset generated in the ordinary way.
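As a rough illustration of what a plain fusion baseline could look like, the sketch below pastes a binary text mask onto a background by simple alpha blending and records the bounding box that a detector such as Faster R-CNN would be trained on. The function name `paste_text`, its parameters, and the choice of alpha blending are assumptions made for illustration only; the abstract does not specify how either fusion method is implemented.

```python
import numpy as np


def paste_text(background, mask, color=(255, 255, 255), alpha=1.0):
    """Alpha-blend a binary text mask onto an RGB background (hypothetical helper).

    background: uint8 array of shape (H, W, 3)
    mask:       array of shape (H, W); nonzero entries mark text pixels
    Returns the composited uint8 image and the text bounding box
    as (x_min, y_min, x_max, y_max) in pixel coordinates.
    """
    if background.shape[:2] != mask.shape:
        raise ValueError("mask must match the background's height and width")

    # Locate the text pixels; the box is the detection label for this image.
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        raise ValueError("mask contains no text pixels")
    box = (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))

    # Per-pixel blending weight: alpha on text pixels, 0 elsewhere.
    weight = (np.asarray(mask) != 0).astype(np.float32) * alpha
    weight = weight[..., None]  # broadcast over the colour channels

    blended = (1.0 - weight) * background.astype(np.float32) \
        + weight * np.asarray(color, dtype=np.float32)
    image = np.clip(np.rint(blended), 0, 255).astype(np.uint8)
    return image, box
```

In a real pipeline the mask would come from rendering Chinese characters with a font, and a more natural fusion would also adapt the text to the local colour and geometry of the background, which this sketch does not attempt.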

Acknowledgements

This work is supported by the Anhui Provincial Natural Science Foundation (grant number 1608085MF136).

Author information

Authors and Affiliations

School of Electrical Engineering and Automation, Anhui University, Hefei, 230601, Anhui, China
Wei-wei Gao, Jun Zhang & Yi Xia
Institute of Health Sciences, Anhui University, Hefei, 230601, Anhui, China
Peng Chen
School of Electrical and Information Engineering, Anhui University of Technology, Ma'anshan, 243032, China
Bing Wang

Corresponding author

Correspondence to Jun Zhang.

Editor information

Editors and Affiliations

Tongji University, Shanghai, China
De-Shuang Huang
Polytechnic of Bari, Bari, Italy
Vitoantonio Bevilacqua
University of Wollongong, North Wollongong, New South Wales, Australia
Prashan Premaratne
Indian Institute of Technology Kanpur, Kanpur, India
Phalguni Gupta

About this paper

Cite this paper

Gao, Ww., Zhang, J., Chen, P., Wang, B., Xia, Y. (2018). Chinese Text Detection Using Deep Learning Model and Synthetic Data. In: Huang, DS., Bevilacqua, V., Premaratne, P., Gupta, P. (eds) Intelligent Computing Theories and Application. ICIC 2018. Lecture Notes in Computer Science, vol 10954. Springer, Cham. https://doi.org/10.1007/978-3-319-95930-6_46

DOI: https://doi.org/10.1007/978-3-319-95930-6_46
Published: 06 July 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-95929-0
Online ISBN: 978-3-319-95930-6
eBook Packages: Computer Science, Computer Science (R0)
