Abstract
Most mainstream object detectors struggle with small object detection. We therefore propose a small target deep convolution recognition algorithm based on an improved YOLOv4 network. First, to obtain more object feature information and improve the detection efficiency for multi-scale small objects, spatial pyramid pooling with different pooling kernel sizes is introduced. Second, to improve the anchor boxes, an improved adaptive anchor structure is proposed. Finally, to enhance the learning ability of the neural network and reduce the computational cost, two cross stage partial parallel structures are adopted. To verify the feasibility of the algorithm, we construct a data set from small and micro electronic components on an industrial assembly line. Experiments show that, compared with the original YOLOv4, the average detection speed and accuracy of the improved network increase by about 30% and 7%, respectively.
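As a rough illustration of the spatial pyramid pooling idea mentioned above, the following NumPy sketch max-pools a feature map with several kernel sizes at stride 1 and concatenates the results with the input along the channel axis. The kernel sizes (5, 9, 13) follow the original YOLOv4 configuration and are an assumption here, since the abstract does not state the sizes used in the improved network. The function names `max_pool_same` and `spp_block` are hypothetical.

```python
import numpy as np


def max_pool_same(x, k):
    """Stride-1 max pooling over H and W with 'same' padding (k must be odd).

    x has shape (channels, height, width) and a floating-point dtype.
    """
    pad = k // 2
    # Pad with -inf so padded cells never win the max.
    padded = np.pad(x, ((0, 0), (pad, pad), (pad, pad)), constant_values=-np.inf)
    # windows has shape (channels, height, width, k, k).
    windows = np.lib.stride_tricks.sliding_window_view(padded, (k, k), axis=(1, 2))
    return windows.max(axis=(-2, -1))


def spp_block(x, kernel_sizes=(5, 9, 13)):
    """Concatenate the input with its pooled versions along the channel axis.

    kernel_sizes is an assumed setting taken from the original YOLOv4,
    not from the improved network described in the abstract.
    """
    pooled = [max_pool_same(x, k) for k in kernel_sizes]
    return np.concatenate([x] + pooled, axis=0)


# Toy feature map: 4 channels on a 13 x 13 grid.
feature_map = np.random.default_rng(0).standard_normal((4, 13, 13)).astype(np.float32)
out = spp_block(feature_map)
print(out.shape)  # (16, 13, 13)
```

Because every pooling branch uses stride 1 with same padding, the spatial resolution is preserved and only the channel count grows, so each location sees context at several receptive-field sizes.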

Cite this article
Li, F., Gao, D., Yang, Y. et al. Small target deep convolution recognition algorithm based on improved YOLOv4. Int. J. Mach. Learn. & Cyber. 14, 387–394 (2023). https://doi.org/10.1007/s13042-021-01496-1
DOI: https://doi.org/10.1007/s13042-021-01496-1