Abstract
Surface defect detection in filled vials is important for pharmaceutical safety. Because defects have weak features and vary in scale, inspection suffers from missed and false detections. In this paper, we first design a data acquisition solution and create three custom datasets: VialG1_DET, VialG2_DET, and VialG3_DET. Second, we design a multi-workstation inspection system that combines traditional image processing algorithms with deep learning object detection algorithms to detect surface and content defects in vials. We propose a network for Defect Detection of Surface and Contents in Vials (DDSCNet). It comprises a Quadra Fusion and Attention (QUFUAtt) module, which enhances the network's feature fusion capability; a mixed self-attention and convolution module (ACmix), which focuses on defective areas; and Linear Deformable Convolution, which extracts the weak features of defects. Our experiments show that the proposed DDSCNet achieves 76.7% mean Average Precision (mAP@0.5) on VialG1_DET, 65.9% mAP@0.5 on VialG2_DET, and 86.9% mAP@0.5 on VialG3_DET, with a low computational complexity of 9.3 GFLOPs, and outperforms YOLOv11 by 3.5% mAP@0.5.
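For readers unfamiliar with the metric, mAP@0.5 counts a predicted box as a correct detection when it overlaps a ground-truth box of the same class with an intersection over union (IoU) of at least 0.5. The following is a minimal sketch of that matching criterion only, assuming axis-aligned boxes given as (x1, y1, x2, y2) in pixels. The function names are illustrative and are not taken from the paper, and the full metric additionally ranks detections by confidence and averages precision over recall levels and classes.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned (x1, y1, x2, y2) boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes contribute no intersection area.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """Hypothetical helper: does the prediction match at the given IoU threshold?"""
    return iou(pred_box, gt_box) >= threshold


if __name__ == "__main__":
    gt = (10, 10, 50, 50)
    print(is_true_positive((12, 12, 52, 52), gt))  # small offset, high overlap
    print(is_true_positive((20, 20, 60, 60), gt))  # larger offset, lower overlap
```

A small defect shifts this criterion quickly: for a 40 by 40 pixel box, a 10 pixel offset in both directions already drops the IoU below 0.5, which is one reason weak, small-scale defects are hard to score well on.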














Data availability
Data are available from the authors upon reasonable request.
Acknowledgements
This work was supported in part by the Provincial Natural Science Foundation of Hunan (No. 2024JJ5383) and the Key Program Scientific Research Fund of Hunan Provincial Education Department (Nos. 22A0127 and 23A0155).
Funding
Not applicable.
Author information
Contributions
Haixia Xu contributed to conceptualization, methodology, software, and writing of the original draft. Yuting Xu was involved in conceptualization, methodology, software, visualization, and writing of the original draft. Kaiyu Hu assisted in validation, investigation, supervision, and writing (review and editing).
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Xu, H., Xu, Y. & Hu, K. A vision-based inspection system for pharmaceutical production line. J Supercomput 81, 625 (2025). https://doi.org/10.1007/s11227-025-07135-8
DOI: https://doi.org/10.1007/s11227-025-07135-8