Abstract
Fabric defect detection is an important task in the textile industry. To address the large variety of fabric defect types, the small size of many defects, and the imbalance among samples, an improved YOLOv5 fabric defect detection algorithm, FD-YOLOv5, is proposed. First, a coordinate attention module is embedded in the YOLOv5 backbone to replace the bottleneck structure of the original network; this reduces the parameter count and computational cost while enhancing the network's feature-extraction ability and improving the detection of small defects. Second, the smoother Mish activation function is used in place of the activation in the original convolution structure, improving the model's nonlinear expressive ability, and the SIoU loss function, which takes the direction of the anchor box into account, is adopted to accelerate convergence and improve detection accuracy. Finally, the focal loss and GHM loss functions are combined as the objectness (confidence) loss to address the sample imbalance in the fabric defect dataset. Experimental results on the public Aliyun TianChi fabric defect dataset show that the improved algorithm achieves an mAP@.5 of 65.1% and an mAP@.5:.95 of 30.4%, improvements of 8.3% and 3.2% over the original model, while the model's parameter count, computation, and weight file size are reduced by 8.4%, 11.2%, and 14.3%, respectively. Even compared with the state-of-the-art YOLOv7 model, the proposed model improves mAP@.5 by 6.5%. Although its FPS is lower than YOLOv7's, it still reaches a detection speed of 79 frames per second, which meets real-time requirements. These results demonstrate the effectiveness of the proposed method, which can serve as a reference for automatic fabric defect detection.
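Two of the components named in the abstract, the Mish activation and the focal loss, have compact closed forms that can be sketched directly. The following is a minimal illustrative sketch in plain Python, not the authors' implementation; Mish is x · tanh(softplus(x)), and the binary focal loss down-weights well-classified examples by a factor of (1 − p_t)^γ. The `alpha`/`gamma` defaults follow common usage and are assumptions here, not values taken from the paper.

```python
import math

def mish(x: float) -> float:
    # Mish activation: x * tanh(softplus(x)), a smooth,
    # non-monotonic alternative to ReLU-style activations.
    return x * math.tanh(math.log1p(math.exp(x)))

def focal_loss(p: float, y: int, alpha: float = 0.25, gamma: float = 2.0) -> float:
    # Binary focal loss: scales the cross-entropy term by
    # (1 - p_t)^gamma so that easy (well-classified) examples
    # contribute less, mitigating class imbalance.
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

For example, a confidently correct prediction (p = 0.9, y = 1) incurs a far smaller focal loss than an uncertain one (p = 0.5, y = 1), which is exactly the re-weighting effect used to counter sample imbalance.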
Data availability
The datasets generated during the current study are available on the Aliyun TianChi website at https://tianchi.aliyun.com/dataset/79336.
Funding
This paper was funded and technically supported by the “National Innovation Center of Advanced Dyeing & Finishing Technology”.
Author information
Authors and Affiliations
Contributions
All authors contributed to the study design. Data collection and analysis were performed by KX and FL. The draft of the manuscript was written by KX, and FL commented on previous versions of the manuscript. Experimental guidance and equipment were provided by ZH and GZ. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Li, F., Xiao, K., Hu, Z. et al. Fabric defect detection algorithm based on improved YOLOv5. Vis Comput 40, 2309–2324 (2024). https://doi.org/10.1007/s00371-023-02918-7