DOI: 10.1145/3448823.3448866
research-article

TextPolar: Accurate Scene Text Detection in the Polar Coordinate

Published: 04 March 2021

ABSTRACT

Driven by advances in deep learning and instance segmentation, segmentation-based scene text detection has achieved remarkable results in the past few years. However, most existing pixel-wise segmentation-based detectors may fail to separate two close text instances because their boundaries are adjacent. To tackle this issue, inspired by PolarMask, we present a detector for curved text that formulates the segmentation problem as predicting the contour of each text instance. We model scene text instances in polar coordinates, which yields two sub-tasks: text instance center classification and distance regression. Moreover, we propose Polygon Center-ness and a Spatial Attention Module, which help suppress low-quality detected bounding boxes and improve overall performance by a large margin. We evaluate the proposed method on the challenging curved-text benchmarks Total-Text and CTW1500. Experiments demonstrate that our method achieves competitive performance.
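As a rough illustration of the polar formulation described in the abstract, the sketch below turns a predicted center and a set of regressed ray lengths into contour vertices. The abstract does not define Polygon Center-ness, so the score shown here is the polar center-ness from PolarMask, used only as a stand-in. The function names, the evenly spaced ray angles, and the starting angle of 0 are all assumptions made for this example, not details taken from the paper.

```python
import math


def polar_to_contour(center, distances):
    """Convert a center point and n ray lengths into n contour vertices.

    Assumption: rays are evenly spaced over 360 degrees, starting at angle 0.
    """
    cx, cy = center
    n = len(distances)
    contour = []
    for i, d in enumerate(distances):
        theta = 2.0 * math.pi * i / n
        contour.append((cx + d * math.cos(theta), cy + d * math.sin(theta)))
    return contour


def polar_centerness(distances):
    """PolarMask-style polar center-ness: sqrt(min(d) / max(d)).

    Used here as a stand-in; TextPolar's Polygon Center-ness may differ.
    """
    d_max = max(distances)
    if d_max <= 0:
        raise ValueError("ray lengths must contain a positive value")
    return math.sqrt(min(distances) / d_max)


# Hypothetical prediction: a center at (10, 20) with four rays of length 2.
contour = polar_to_contour((10.0, 20.0), [2.0, 2.0, 2.0, 2.0])
score = polar_centerness([2.0, 2.0, 2.0, 2.0])
```

A center whose rays all have similar length scores close to 1, while a center near the instance boundary (one very short ray, one very long ray) scores close to 0, which is how such a score can down-weight low-quality detections.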

    • Published in

      ICVISP 2020: Proceedings of the 2020 4th International Conference on Vision, Image and Signal Processing
      December 2020
      366 pages
      ISBN: 9781450389532
      DOI: 10.1145/3448823

      Copyright © 2020 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Qualifiers

      • research-article
      • Research
      • Refereed limited

      Acceptance Rates

      ICVISP 2020 Paper Acceptance Rate: 60 of 147 submissions, 41%
      Overall Acceptance Rate: 186 of 424 submissions, 44%