ABSTRACT
We introduce a text detection neural network that accurately locates text in a variety of complex environments and outputs the best-fitting rectangle enclosing it. The network has three parts. The first is a backbone built from a residual network, which refines the feature map. The second is a sequence module built from a transformer, which processes the feature map row by row in order to mine the context between characters in the image. The third is a multi-scale detection module, which detects the best target box from feature maps of different sizes. The residual backbone helps keep gradients stable during backpropagation. Because information in grid cells on the same row is consistent, the transformer module attends more strongly to the text line. The detection module uses multiple anchors in the vertical direction simultaneously, which achieves good results in both speed and accuracy. In experiments on ICDAR 2015, a dataset commonly used in text detection, the network achieves an F-score of 0.7.
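The idea of placing several vertical anchors on feature maps of different sizes can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the anchor heights (8, 16, 32) and the strides (8, 16, 32) are assumptions chosen only for illustration, since the abstract does not state them.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def vertical_anchors(
    grid_rows: int,
    grid_cols: int,
    stride: int,
    heights: Tuple[int, ...] = (8, 16, 32),  # assumed heights, in pixels
) -> List[Box]:
    """Return fixed-width anchors of several heights centred on each grid cell."""
    boxes: List[Box] = []
    for row in range(grid_rows):
        for col in range(grid_cols):
            # Centre of the grid cell in input-image pixels.
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for h in heights:
                # Width is fixed to one cell; only the height varies.
                boxes.append(
                    (cx - stride / 2, cy - h / 2, cx + stride / 2, cy + h / 2)
                )
    return boxes


def multi_scale_anchors(
    image_size: int,
    strides: Tuple[int, ...] = (8, 16, 32),  # assumed feature-map strides
) -> List[Box]:
    """Collect anchors from square feature maps at several scales."""
    boxes: List[Box] = []
    for stride in strides:
        cells = image_size // stride
        boxes.extend(vertical_anchors(cells, cells, stride))
    return boxes
```

Each grid cell contributes one anchor per height, so a coarser feature map yields fewer but larger-context candidates, while a finer one covers small text more densely.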