Real-Time Accurate Text Detection with Adaptive Double Pyramid Network

Zhou, Weina; Song, Wanyu

doi:10.1007/s11063-022-11080-5

Published: 17 November 2022

Volume 55, pages 5055–5067, (2023)

Neural Processing Letters

Abstract

Segmentation-based methods have recently been widely adopted in scene text detection because they predict the shape of scene text at the pixel level more accurately than other methods. However, the complicated feature aggregation or label assignment algorithms used in current segmentation-based methods significantly reduce detection speed as accuracy improves. In this paper, we present an Adaptive Double Pyramid Network (ADPNet) for real-time detection of arbitrary-shaped text. ADPNet builds a Double Feature Enhancement Pyramid from Packet Downsampling Units (PDUnits) to enhance feature maps with minimal processing. We validate ADPNet on three benchmark datasets, where it obtains state-of-the-art performance in both speed and accuracy. Specifically, the proposed network achieves an F-measure of 85.7% while running at 40.5 fps on the ICDAR2015 dataset.
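
For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how they are conventionally computed. It assumes the standard definitions (F-measure as the harmonic mean of precision and recall, fps as images processed per second of wall-clock time). The function names and the counts in the usage note are hypothetical illustrations and are not taken from the paper, whose exact evaluation protocol is not described here.

```python
def f_measure(true_positives: int, num_detections: int, num_ground_truths: int) -> float:
    """Harmonic mean of precision and recall (the standard F1 definition).

    All three arguments are hypothetical match counts for a detector,
    not values reported in the paper.
    """
    if num_detections == 0 or num_ground_truths == 0:
        return 0.0
    precision = true_positives / num_detections    # fraction of detections that are correct
    recall = true_positives / num_ground_truths    # fraction of ground-truth texts that were found
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def frames_per_second(num_images: int, elapsed_seconds: float) -> float:
    """Throughput as images processed per second of elapsed time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return num_images / elapsed_seconds
```

As a usage illustration with made-up numbers, `f_measure(80, 100, 100)` gives a precision and recall of 0.8 each and therefore an F-measure of 0.8, and `frames_per_second(405, 10.0)` corresponds to 40.5 fps.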

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant No. 61404083) and State Key Laboratory of ASIC & System (2021KF010).

Author information

Authors and Affiliations

College of Information Engineering, Shanghai Maritime University, Shanghai, 201306, China
Weina Zhou & Wanyu Song

Corresponding author

Correspondence to Weina Zhou.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhou, W., Song, W. Real-Time Accurate Text Detection with Adaptive Double Pyramid Network. Neural Process Lett 55, 5055–5067 (2023). https://doi.org/10.1007/s11063-022-11080-5

Accepted: 20 October 2022
Published: 17 November 2022
Issue Date: August 2023
DOI: https://doi.org/10.1007/s11063-022-11080-5
