Image dataset creation and networks improvement method based on CAD model and edge operator for object detection in the manufacturing industry

Original Paper · Published in Machine Vision and Applications

Abstract

Creating image datasets for object detection is time-consuming and laborious work that seriously hinders the rapid adoption of object detection in industrial manufacturing. To reduce the time and cost of applying object detection, a method for image dataset creation and network improvement based on CAD models and edge extraction operators is proposed. It can quickly generate an effective training dataset without tedious manual work and enables object detection networks to achieve good detection performance. The method consists of three steps: automatic image capture, cut-and-paste and network improvement. To improve the performance of the detection networks, edge extraction operators are used to obtain features common to the synthetic images and the real images. The operators considered are the Sobel edge, Laplacian edge, Canny edge and adaptive threshold edge, and the experimental results show that the adaptive threshold edge achieves the best effect. In addition, class weights are adopted to improve the AP of hard-to-detect parts. Ten mechanical parts of a 3D-printed aero-engine are used to evaluate the method. The results show that the improved network (YOLOv5s) trained on the synthetic images achieves an average recall of 99.08%, an average precision of 93.83% and an mAP of 98.91%. Taking time, cost and detection performance into account, the proposed method clearly outperforms both the traditional method and a current advanced method. It is feasible for object detection in many industrial scenarios where CAD models of products are readily available.
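As a concrete illustration of the edge extraction step, the following Python sketch applies the four operators named above (Sobel, Laplacian, Canny and adaptive threshold) to a grayscale image with OpenCV. It is a minimal sketch and not the authors' implementation; the kernel sizes, Canny thresholds and adaptive-threshold parameters are illustrative assumptions, not values taken from the paper.

```python
import cv2
import numpy as np


def edge_maps(image_path: str) -> dict:
    """Return four edge representations of an image (illustrative parameters)."""
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    if gray is None:
        raise FileNotFoundError(image_path)

    # Sobel edge: gradient magnitude from horizontal and vertical derivatives.
    gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
    sobel = cv2.convertScaleAbs(np.sqrt(gx ** 2 + gy ** 2))

    # Laplacian edge: second-derivative response.
    laplacian = cv2.convertScaleAbs(cv2.Laplacian(gray, cv2.CV_64F, ksize=3))

    # Canny edge: hysteresis thresholds chosen arbitrarily here.
    canny = cv2.Canny(gray, 50, 150)

    # Adaptive threshold edge: local Gaussian-weighted thresholding,
    # reported in the abstract as the best-performing variant.
    adaptive = cv2.adaptiveThreshold(
        gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV, 11, 2)

    return {"sobel": sobel, "laplacian": laplacian,
            "canny": canny, "adaptive": adaptive}
```

In a pipeline of this kind, such edge maps would be computed for both the CAD-rendered synthetic images and the real camera images before training, so that the detector learns features shared by the two domains.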

Availability of data and material

Not applicable.

Code availability

Not applicable.


Acknowledgements

This work was supported in part by the National Defense Basic Scientific Research Program of China (CN) under Grants JCKY2018605C003, JCKY2018203A001 and JCKY2017203B071.


Author information

Contributions

PT was involved in methodology, software, validation and writing the original draft. YG was involved in supervision and funding acquisition. HL was involved in methodology, formal analysis and investigation. ZW was involved in data curation and formal analysis. GZ was involved in project administration and investigation. JP was involved in project administration and reviewing the writing.

Corresponding author

Correspondence to Yu Guo.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article


Cite this article

Tang, P., Guo, Y., Li, H. et al. Image dataset creation and networks improvement method based on CAD model and edge operator for object detection in the manufacturing industry. Machine Vision and Applications 32, 111 (2021). https://doi.org/10.1007/s00138-021-01237-y


