ABSTRACT
Driven by advances in deep learning and instance segmentation, segmentation-based scene text detection has achieved remarkable results in the past few years. However, most existing pixel-wise segmentation-based detectors may fail to separate two close text instances because their boundaries are adjacent. To tackle this issue, inspired by PolarMask, we present a detector for curved text that formulates the segmentation problem as predicting the contour of each text instance. We model scene text instances in polar coordinates, which yields two sub-tasks: text instance center classification and distance regression. Moreover, we propose Polygon Center-ness and a Spatial Attention Module, which help suppress low-quality detected bounding boxes and improve the overall performance by a large margin. We evaluate the proposed method on the challenging curved-text benchmarks Total-Text and CTW1500. Experiments demonstrate that our method achieves performance comparable to that of existing methods.
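The polar formulation described in the abstract can be illustrated with a minimal sketch. It assumes a text instance is given by a center point plus one distance per ray, with rays evenly spaced over 360 degrees starting at angle 0. The function names `polar_to_contour` and `polar_centerness` are hypothetical. The abstract does not define the paper's Polygon Center-ness, so the second function uses the polar center-ness from PolarMask, sqrt(min(d) / max(d)), as a stand-in rather than the authors' formula.

```python
import math
from typing import List, Tuple


def polar_to_contour(center: Tuple[float, float],
                     distances: List[float]) -> List[Tuple[float, float]]:
    """Turn a center point and per-ray distances into contour vertices.

    Assumption: rays are evenly spaced over 360 degrees, starting at angle 0.
    """
    cx, cy = center
    n = len(distances)
    contour = []
    for i, d in enumerate(distances):
        theta = 2.0 * math.pi * i / n  # angle of the i-th ray
        contour.append((cx + d * math.cos(theta), cy + d * math.sin(theta)))
    return contour


def polar_centerness(distances: List[float]) -> float:
    """PolarMask-style polar center-ness: sqrt(min(d) / max(d)).

    Stand-in only; not the paper's Polygon Center-ness.
    """
    d_max = max(distances)
    if d_max <= 0:
        return 0.0
    return math.sqrt(min(distances) / d_max)


if __name__ == "__main__":
    # Four rays of equal length around (10, 10) give a diamond-shaped contour.
    print(polar_to_contour((10.0, 10.0), [2.0, 2.0, 2.0, 2.0]))
    # Unequal ray lengths indicate an off-center point and lower the score.
    print(polar_centerness([1.0, 4.0, 1.0, 4.0]))
```

In this sketch, a candidate center whose rays have very different lengths receives a low score, which is one way low-quality predictions far from the instance center could be down-weighted.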
Index Terms
- TextPolar: Accurate Scene Text Detection in the Polar Coordinate