Path Aggregation and Dual Supervision Network for Scene Text Detection

Feng, Shuyang; Zhang, Na; Zhao, Cairong

doi:10.1007/978-3-030-60636-7_6

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12307)

Included in the following conference series:

Chinese Conference on Pattern Recognition and Computer Vision (PRCV)

Abstract

In recent years, instance segmentation-based scene text detection has attracted wide attention from academia and industry. However, segmentation methods built on the encoder-decoder paradigm are limited by the information loss caused by subsampling, which is the root cause of pixel misclassification in the instance segmentation task. In this paper, we propose an effective approach for scene text detection, named Path Aggregation and Dual Supervision Network (PADSNet). To introduce the coarse-to-fine detection idea into the one-stage segmentation algorithm, we design a single-task multi-level supervision method. Meanwhile, deformable convolution is used to overcome the limits of the CNN's rectangular receptive field, so that the network can better adapt to scene text of arbitrary shape. Experimental results show that our method effectively reduces pixel misclassification and achieves f-measures of 85.4% and 83.19% on the ICDAR2015 and CTW1500 datasets, respectively.
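
The abstract does not spell out the form of the multi-level supervision, so the following is only a minimal sketch of one possible reading: the same text-segmentation task is supervised at a coarse resolution and at the full resolution, and the two losses are combined. The Dice loss, the max-pooled coarse target, and the names `dice_loss`, `downsample`, `dual_supervision_loss`, `factor`, and `coarse_weight` are all assumptions made for illustration and are not taken from the paper.

```python
def dice_loss(pred, target, eps=1e-6):
    """Dice loss between two equally sized 2D maps with values in [0, 1]."""
    inter = sum(p * t for p_row, t_row in zip(pred, target)
                for p, t in zip(p_row, t_row))
    p_sum = sum(p * p for row in pred for p in row)
    t_sum = sum(t * t for row in target for t in row)
    return 1.0 - (2.0 * inter + eps) / (p_sum + t_sum + eps)


def downsample(mask, factor):
    """Max-pool a binary mask by `factor` to build a coarse target.

    Assumes the mask height and width are divisible by `factor`.
    """
    h, w = len(mask), len(mask[0])
    return [[max(mask[y + dy][x + dx]
                 for dy in range(factor) for dx in range(factor))
             for x in range(0, w, factor)]
            for y in range(0, h, factor)]


def dual_supervision_loss(coarse_pred, fine_pred, gt_mask,
                          factor=2, coarse_weight=0.5):
    """Weighted sum of a coarse-level and a fine-level segmentation loss.

    Hypothetical formulation: both levels are trained on the same task
    (text / non-text segmentation), only at different resolutions.
    """
    coarse_gt = downsample(gt_mask, factor)
    coarse_term = dice_loss(coarse_pred, coarse_gt)
    fine_term = dice_loss(fine_pred, gt_mask)
    return coarse_weight * coarse_term + (1.0 - coarse_weight) * fine_term


# Toy 4x4 ground-truth mask with a 2x2 text region in the centre.
gt = [[0, 0, 0, 0],
      [0, 1, 1, 0],
      [0, 1, 1, 0],
      [0, 0, 0, 0]]

perfect_loss = dual_supervision_loss(downsample(gt, 2), gt, gt)
empty_loss = dual_supervision_loss([[0, 0], [0, 0]],
                                   [[0] * 4 for _ in range(4)], gt)
print(perfect_loss, empty_loss)
```

In this toy setup a prediction that matches the target at both levels gives a loss near zero, while an all-background prediction gives a loss near one. In a real network the coarse and fine predictions would come from different decoder stages rather than being supplied by hand.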

Author information

Authors and Affiliations

Department of Computer Science and Technology, Tongji University, Shanghai, China
Shuyang Feng, Na Zhang & Cairong Zhao
Key Laboratory of Embedded System and Service Computing (Tongji University), Ministry of Education, Shanghai, 201804, China
Cairong Zhao

Corresponding author

Correspondence to Cairong Zhao.

Editor information

Editors and Affiliations

Peking University, Beijing, China
Yuxin Peng
Nanjing University of Information Science and Technology, Nanjing, China
Qingshan Liu
Dalian University of Technology, Dalian, China
Huchuan Lu
Chinese Academy of Sciences, Beijing, China
Zhenan Sun
Chinese Academy of Sciences, Beijing, China
Chenglin Liu
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Xilin Chen
Peking University, Beijing, China
Hongbin Zha
Nanjing University of Science and Technology, Nanjing, China
Jian Yang

About this paper

Cite this paper

Feng, S., Zhang, N., Zhao, C. (2020). Path Aggregation and Dual Supervision Network for Scene Text Detection. In: Peng, Y., et al. Pattern Recognition and Computer Vision. PRCV 2020. Lecture Notes in Computer Science, vol 12307. Springer, Cham. https://doi.org/10.1007/978-3-030-60636-7_6

DOI: https://doi.org/10.1007/978-3-030-60636-7_6
Published: 13 October 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60635-0
Online ISBN: 978-3-030-60636-7
eBook Packages: Computer Science, Computer Science (R0)