Abstract
Existing object detection methods have significant limitations when detecting defects on the base surface of metal sensors, notably high rates of false detection and missed detection. To address this problem, we propose an improved algorithm based on You Only Look Once (YOLO) v5s. First, because the C3 module performs poorly on small defects, the C2f module replaces part of the C3 modules in the neck of YOLO v5s to enrich the gradient flow information and improve defect detection accuracy. Second, an improved attention mechanism, the Dilated Global Attention Mechanism, is proposed to make the network focus more on important feature information. Dilated convolution is integrated into the spatial attention mechanism to enlarge the receptive field of the model, reduce the model size, and improve detection performance on small defects. Finally, we propose a novel localization loss function, Intersection over Union (IoU) with Normalized Wasserstein Distance, which alleviates the sensitivity of Complete IoU (CIoU) loss based metrics to location deviations of small defects and also adapts to diverse datasets. Ablation experiments show that the improved YOLO v5s algorithm raises the mean Average Precision (mAP) by 5.3% and the precision (P) by 7% compared with the original algorithm.
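To make the idea behind the localization loss concrete, the following is a minimal sketch of how an IoU term and a Normalized Wasserstein Distance (NWD) term might be blended. It is not the authors' implementation: the abstract gives no formula, so the NWD definition here follows the Gaussian box model of Wang et al. (arXiv:2110.13389), plain IoU stands in for CIoU to keep the example short, and the function names, the weighting parameter `iou_ratio`, and the constant `c` are all assumptions made for illustration.

```python
import math


def iou_xywh(a, b):
    """Plain IoU of two boxes given as (cx, cy, w, h)."""
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def nwd(a, b, c=12.8):
    """Normalized Wasserstein Distance between two (cx, cy, w, h) boxes.

    Each box is modeled as a 2D Gaussian with mean (cx, cy) and covariance
    diag(w^2 / 4, h^2 / 4). The constant c is dataset dependent; 12.8 is
    only a placeholder value (assumption).
    """
    w2_squared = (
        (a[0] - b[0]) ** 2
        + (a[1] - b[1]) ** 2
        + ((a[2] - b[2]) / 2) ** 2
        + ((a[3] - b[3]) / 2) ** 2
    )
    return math.exp(-math.sqrt(w2_squared) / c)


def iou_nwd_loss(pred, target, iou_ratio=0.5, c=12.8):
    """Weighted blend of an IoU loss and an NWD loss (hypothetical weighting)."""
    iou_loss = 1.0 - iou_xywh(pred, target)
    nwd_loss = 1.0 - nwd(pred, target, c)
    return iou_ratio * iou_loss + (1.0 - iou_ratio) * nwd_loss


# Two small boxes that do not overlap: the IoU term saturates at 1,
# while the NWD term still reflects how far apart the boxes are.
small_pred = (10.0, 10.0, 4.0, 4.0)
small_target = (15.0, 10.0, 4.0, 4.0)
print(iou_xywh(small_pred, small_target))
print(iou_nwd_loss(small_pred, small_target))
```

The point of the blend is visible in the last two lines: for small, non-overlapping boxes the IoU is zero regardless of distance, so the NWD term is what still distinguishes a near miss from a distant one.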
Data availability
No datasets were generated or analysed during the current study.
Acknowledgements
This work was supported by the Jiangsu Key Laboratory of Advanced Food Manufacturing Equipment and Technology (FMZ201901) and the National Natural Science Foundation of China project “Research on bionic chewing robot for physical property detection and evaluation of food materials” (51375209).
Author information
Contributions
Bufan Zhang, Xingfei Zhu and Jinghu Yu were responsible for conceptualization and methodology. Bufan Zhang, Xingfei Zhu, Zhaofei Sun and Qimeng Wang performed data curation. Bufan Zhang, Xingfei Zhu, Jinghu Yu, Zhaofei Sun and Qimeng Wang performed formal analysis. Bufan Zhang and Xingfei Zhu drafted the manuscript. Bufan Zhang, Xingfei Zhu and Jinghu Yu supervised the work. All authors reviewed the results and approved the final version of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
About this article
Cite this article
Zhang, B., Zhu, X., Yu, J. et al. Metal sensor base defects detection using deep learning based YOLO network. SIViP 19, 47 (2025). https://doi.org/10.1007/s11760-024-03685-1