Abstract
In recent years, instance segmentation-based scene text detection has attracted wide attention from academia and industry. However, segmentation methods built on the encoder-decoder paradigm are limited by the information loss caused by subsampling, which is the root cause of pixel misclassification in the instance segmentation task. In this paper, we propose an effective approach for scene text detection, named Path Aggregation and Dual Supervision Network (PADSNet). To introduce the coarse-to-fine detection idea into the one-stage segmentation algorithm, we design a single-task multi-level supervision method. Meanwhile, deformable convolution is used to overcome the limits of the CNN's rectangular receptive field, so that the network can better adapt to scene text of arbitrary shape. The experimental results show that our method effectively reduces pixel misclassification and achieves f-measures of 85.4% and 83.19% on the ICDAR2015 and CTW1500 datasets, respectively.
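The f-measure quoted in the abstract is, in the usual detection-benchmark sense, the harmonic mean of precision and recall. The following is a minimal sketch assuming that standard definition; the function name `f_measure` and the counts passed to it are hypothetical and are not taken from the paper or from any benchmark's evaluation script.

```python
def f_measure(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Harmonic mean of precision and recall, computed from detection counts."""
    detected = true_positives + false_positives       # all predicted text instances
    ground_truth = true_positives + false_negatives   # all annotated text instances
    if detected == 0 or ground_truth == 0:
        return 0.0
    precision = true_positives / detected
    recall = true_positives / ground_truth
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only (not results from the paper).
score = f_measure(true_positives=80, false_positives=20, false_negatives=20)
print(f"f-measure: {score:.3f}")
```

Because the harmonic mean is dominated by the smaller of the two terms, a detector cannot reach a high f-measure by trading away recall for precision or the reverse.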
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Feng, S., Zhang, N., Zhao, C. (2020). Path Aggregation and Dual Supervision Network for Scene Text Detection. In: Peng, Y., et al. Pattern Recognition and Computer Vision. PRCV 2020. Lecture Notes in Computer Science, vol 12307. Springer, Cham. https://doi.org/10.1007/978-3-030-60636-7_6
DOI: https://doi.org/10.1007/978-3-030-60636-7_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60635-0
Online ISBN: 978-3-030-60636-7
eBook Packages: Computer Science, Computer Science (R0)