DOI: 10.1145/3581807.3581854
research-article

ResAsapp: An Effective Convolution to Distinguish Adjacent Pixels For Scene Text Detection

Published: 22 May 2023

ABSTRACT

Segmentation-based approaches are an important direction in scene text detection. Because they can detect arbitrarily shaped or curved text, they have attracted increasing attention from researchers. However, extensive research has shown that segmentation-based methods are disturbed by adjoining pixels and cannot effectively identify text boundaries. To tackle this problem, we propose ResAsapp Conv, a convolution structure based on the PSE algorithm. It provides visual fields of the object at different scales, which lets the network recognize text boundaries effectively. The method's effectiveness is validated on three benchmark datasets: CTW1500, Total-Text, and ICDAR2015. In particular, on CTW1500, a dataset full of long curved text in all kinds of scenes that is hard to distinguish, our network achieves an F-measure of 81.2%.
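For readers unfamiliar with the metric quoted above, the F-measure is the harmonic mean of precision and recall. The sketch below computes it from detection counts. The function name `f_measure` and the counts used are hypothetical and chosen only for illustration; they are not results from the paper, and the rule for matching a detection to a ground-truth text instance (defined by each benchmark's evaluation protocol) is assumed to have been applied already.

```python
def f_measure(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the harmonic mean of precision and recall for the given counts."""
    detected = true_positives + false_positives      # all predicted text instances
    ground_truth = true_positives + false_negatives  # all annotated text instances
    if detected == 0 or ground_truth == 0:
        return 0.0
    precision = true_positives / detected
    recall = true_positives / ground_truth
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative counts only (assumption), not taken from the paper.
score = f_measure(true_positives=90, false_positives=10, false_negatives=30)
print(f"{score:.3f}")  # prints 0.818
```

With these made-up counts, precision is 0.90 and recall is 0.75, so the F-measure sits between the two and closer to the lower value.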


Published in

ICCPR '22: Proceedings of the 2022 11th International Conference on Computing and Pattern Recognition
November 2022
683 pages
ISBN: 9781450397056
DOI: 10.1145/3581807

Copyright © 2022 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery
New York, NY, United States


Qualifiers

• research-article
• Research
• Refereed limited