Semantic-Information Space Sharing Interaction Network for Arbitrary Shape Text Detection

Chen, Hua; Wang, Runmin; Zhu, Yanbin; Zhu, Zhenlin; Hei, Jielei; Xu, Juan; Ding, Yajun

doi:10.1007/978-981-99-8540-1_4

Runmin Wang ORCID: orcid.org/0000-0001-9687-9918

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14431)

Included in the following conference series:

Chinese Conference on Pattern Recognition and Computer Vision (PRCV)

Abstract

Arbitrary shape text detection is a challenging task due to significant variations in text shapes, sizes, and aspect ratios. Previous approaches that rely on a single-level feature map generated through top-down fusion of different feature levels are limited in how well they harness high-level semantic information and express multi-scale features. To address these challenges, this paper introduces a novel arbitrary shape scene text detector called the Semantic-information Space Sharing Interaction Network (SSINet). The proposed network leverages the Semantic-information Space Sharing Module (SSM) to generate a single-level feature map that expresses multi-scale features with rich semantics and a prominent foreground, enabling effective processing of text-related information. Experimental evaluations on three benchmark datasets, namely CTW-1500, MSRA-TD500, and ICDAR2017-MLT, validate the effectiveness of our method. The proposed SSINet achieves F-scores of 86.0% on CTW-1500, 89.1% on MSRA-TD500, and 72.4% on ICDAR2017-MLT. The code will be available at https://github.com/123cjjjj/SSINet.
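
For orientation, the conventional top-down fusion that the abstract contrasts against can be sketched as follows. This is a minimal illustration under stated assumptions, not the SSM or any code from the SSINet repository: the feature maps are single-channel 2D grids, each level is half the side length of the previous one, and levels are merged by nearest-neighbour upsampling plus element-wise addition. The names `upsample2x`, `top_down_fuse`, and `constant_map` are hypothetical.

```python
def upsample2x(grid):
    """Nearest-neighbour 2x upsampling of a 2D list of numbers."""
    out = []
    for row in grid:
        wide = [v for v in row for _ in range(2)]  # repeat each value horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat each row vertically
    return out


def top_down_fuse(pyramid):
    """Fuse a feature pyramid into one map at the finest resolution.

    `pyramid` is ordered finest first; each level is assumed to have
    half the side length of the level before it.
    """
    fused = pyramid[-1]                            # start from the coarsest level
    for finer in reversed(pyramid[:-1]):
        up = upsample2x(fused)
        # Assumption: levels are merged by element-wise addition.
        fused = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(finer, up)]
    return fused


def constant_map(size, value):
    """Toy feature map: a size x size grid filled with one value."""
    return [[value] * size for _ in range(size)]


# Toy three-level pyramid: 8x8, 4x4, and 2x2 grids.
pyramid = [constant_map(8, 1.0), constant_map(4, 2.0), constant_map(2, 3.0)]
single_level = top_down_fuse(pyramid)
print(len(single_level), len(single_level[0]))
```

In this toy setup every level contributes with equal weight, so the coarse, semantically rich level has no privileged influence on the fused map. That is one way to read the limitation the abstract attributes to plain top-down fusion.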

This work was supported in part by the Natural Science Foundation of Hunan Province (No. 2020JJ4057), the Key Research and Development Program of Changsha Science and Technology Bureau (No. kq2004050), and the Scientific Research Foundation of the Education Department of Hunan Province of China (No. 21A0052).

Notes

1. https://rrc.cvc.uab.es/

Author information

Authors and Affiliations

School of Information Science and Engineering, Hunan Normal University, Changsha, 410081, China
Hua Chen, Runmin Wang, Yanbin Zhu, Zhenlin Zhu, Jielei Hei, Juan Xu & Yajun Ding

Corresponding author

Correspondence to Runmin Wang.

Editor information

Editors and Affiliations

Nanjing University of Information Science and Technology, Nanjing, China
Qingshan Liu
Xiamen University, Xiamen, China
Hanzi Wang
Beijing University of Posts and Telecommunications, Beijing, China
Zhanyu Ma
Sun Yat-sen University, Guangzhou, China
Weishi Zheng
Peking University, Beijing, China
Hongbin Zha
Chinese Academy of Sciences, Beijing, China
Xilin Chen
Chinese Academy of Sciences, Beijing, China
Liang Wang
Xiamen University, Xiamen, China
Rongrong Ji

About this paper

Cite this paper

Chen, H. et al. (2024). Semantic-Information Space Sharing Interaction Network for Arbitrary Shape Text Detection. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14431. Springer, Singapore. https://doi.org/10.1007/978-981-99-8540-1_4

DOI: https://doi.org/10.1007/978-981-99-8540-1_4
Published: 25 December 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8539-5
Online ISBN: 978-981-99-8540-1
eBook Packages: Computer Science, Computer Science (R0)
