DOI: 10.1145/3655532.3655558

UFNet: A Multi-scale Fusion Feature based Text Detection Method

Published: 28 June 2024

Abstract

Segmentation-based methods that incorporate feature sampling have become increasingly common in text detection. Segmentation methods have a natural advantage in detecting text of both regular and irregular shape, because they can effectively separate diverse targets from backgrounds that differ significantly from them. The sampling structure commonly used in these networks is the Feature Pyramid Network (FPN), which combines feature maps of different resolutions so that targets of different scales can be detected. However, scene text poses inherent difficulties, such as widely varying aspect (width-to-height) ratios and densely packed text, that general-purpose sampling methods such as FPN may not handle effectively. To ease this problem, we propose a novel network architecture, the Unified Feature Fusion Network (UFNet), which integrates feature sampling. Compared to the DBU network, UFNet achieves significantly better accuracy and recall on English text detection datasets such as ICDAR2015 and on the mixed English-Chinese dataset MSRA-TD500. The detection results indicate that the algorithm addresses the poor performance of existing methods on images whose text varies in aspect ratio.
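For readers unfamiliar with the FPN-style sampling that the abstract takes as its baseline, the following is a minimal sketch of generic top-down multi-scale fusion. It is not UFNet's architecture, which the abstract does not specify. The sketch assumes single-channel feature maps stored as nested lists, a pyramid in which each level has half the resolution of the previous one, and nearest-neighbour upsampling; all function and variable names are hypothetical.

```python
from typing import List

Grid = List[List[float]]  # one single-channel feature map (assumption for brevity)


def upsample_nearest(fm: Grid, factor: int) -> Grid:
    """Nearest-neighbour upsampling: repeat every value `factor` times on both axes."""
    out: Grid = []
    for row in fm:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def add_maps(a: Grid, b: Grid) -> Grid:
    """Element-wise sum of two feature maps of identical shape."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("feature maps must have the same shape")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def top_down_fuse(pyramid: List[Grid]) -> List[Grid]:
    """Fuse a pyramid ordered fine -> coarse, FPN style.

    Starting from the coarsest level, each fused map is upsampled by 2
    and added to the next finer (lateral) map.
    """
    fused = [pyramid[-1]]
    for lateral in reversed(pyramid[:-1]):
        fused.insert(0, add_maps(lateral, upsample_nearest(fused[0], 2)))
    return fused


def stack_at_finest(fused: List[Grid]) -> List[Grid]:
    """Upsample every fused level to the finest resolution and return them as channels."""
    target = len(fused[0])
    return [upsample_nearest(level, target // len(level)) for level in fused]


# Toy pyramid: 4x4, 2x2 and 1x1 maps filled with constant values.
c2 = [[1.0] * 4 for _ in range(4)]
c3 = [[2.0] * 2 for _ in range(2)]
c4 = [[4.0]]

fused = top_down_fuse([c2, c3, c4])
channels = stack_at_finest(fused)
print(fused[0][0][0], fused[1][0][0], fused[2][0][0])
print(len(channels), len(channels[0]), len(channels[0][0]))
```

In a segmentation-based detector, the stacked channels would typically be passed to a prediction head that outputs a per-pixel text probability map. Real implementations operate on multi-channel tensors and apply learned convolutions before and after each addition, which this sketch omits.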

Published In

ICRSA '23: Proceedings of the 2023 6th International Conference on Robot Systems and Applications
September 2023
335 pages
ISBN: 9798400708039
DOI: 10.1145/3655532

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. binarization
  2. feature fusion
  3. segmentation network
  4. text detection

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICRSA 2023
