Abstract
Creating an image dataset for object detection is time-consuming and laborious, which seriously hinders the rapid application of object detection in industrial manufacturing. To reduce the time and cost of applying object detection, a method for image dataset creation and network improvement based on CAD models and edge extraction operators is proposed. It quickly generates an effective training dataset without tedious manual work and enables object detection networks to achieve good detection performance. The method consists of three steps: automatic image capture, cut-and-paste, and network improvement. To improve the performance of the detection networks, edge extraction operators are used to obtain features common to the synthetic images and the real images. The operators evaluated are Sobel edge, Laplacian edge, Canny edge and adaptive threshold edge, and the experimental results show that the adaptive threshold edge performs best. In addition, class weights are adopted to improve the AP of hard-to-detect parts. Ten mechanical parts of a 3D-printed aero-engine are used to evaluate the method. The results show that the improved network (YOLOv5s) trained with the synthetic images achieves an average recall of 99.08%, an average precision of 93.83% and a mAP of 98.91%. Taking time, cost and detection performance into account, the proposed method performs much better than both the traditional method and a current advanced method. The proposed method is feasible for object detection in the many industrial scenarios where CAD models of products are easily obtained.
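As a rough illustration of two of the steps named above, the following NumPy-only sketch shows an adaptive threshold edge map and a cut-and-paste composite with its bounding-box label. It is not the authors' implementation: the function names (`box_mean`, `adaptive_threshold_edges`, `cut_and_paste`), the mean-based thresholding rule, the block size, the offset `c` and the normalized centre/size box format are all assumptions chosen for illustration.

```python
import numpy as np


def box_mean(img, k):
    """Mean of each k x k neighbourhood (k odd), with edge-replicated borders."""
    pad = k // 2
    padded = np.pad(img.astype(np.float64), pad, mode="edge")
    # Integral image with a leading row and column of zeros.
    integral = np.pad(padded, ((1, 0), (1, 0)), mode="constant").cumsum(0).cumsum(1)
    h, w = img.shape
    total = (integral[k:k + h, k:k + w] - integral[:h, k:k + w]
             - integral[k:k + h, :w] + integral[:h, :w])
    return total / (k * k)


def adaptive_threshold_edges(gray, block_size=5, c=2.0):
    """Mark pixels darker than (local mean - c) as 255, all others as 0.

    Assumed rule: mean-based adaptive thresholding with inverted output,
    so flat regions vanish and only intensity transitions remain.
    """
    local_mean = box_mean(gray, block_size)
    return np.where(gray < local_mean - c, 255, 0).astype(np.uint8)


def cut_and_paste(background, patch, mask, top, left):
    """Paste `patch` onto `background` where `mask` is True.

    Returns the composite image and a normalized (cx, cy, w, h) box,
    so the label is produced without manual annotation.
    """
    out = background.copy()
    h, w = patch.shape[:2]
    region = out[top:top + h, left:left + w]  # view into `out`
    region[mask] = patch[mask]
    ys, xs = np.nonzero(mask)
    y0, y1 = top + ys.min(), top + ys.max() + 1
    x0, x1 = left + xs.min(), left + xs.max() + 1
    img_h, img_w = background.shape[:2]
    box = (float((x0 + x1) / 2 / img_w), float((y0 + y1) / 2 / img_h),
           float((x1 - x0) / img_w), float((y1 - y0) / img_h))
    return out, box


if __name__ == "__main__":
    # Stand-in for a rendered CAD part: a dark square on a bright background.
    background = np.full((20, 20), 200, dtype=np.uint8)
    part = np.full((10, 10), 50, dtype=np.uint8)
    part_mask = np.ones((10, 10), dtype=bool)
    image, box = cut_and_paste(background, part, part_mask, top=5, left=5)
    edges = adaptive_threshold_edges(image)
    print(box, int(edges.sum() // 255))
```

Under these assumptions, applying the same edge operator to both synthetic and real images before training and inference discards flat shading and texture and keeps mainly contours, which is the kind of shared feature the abstract refers to.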
Availability of data and material
Not applicable.
Code availability
Not applicable.
Acknowledgements
This work was supported in part by the National Defense Basic Scientific Research Program of China (CN) under Grants JCKY2018605C003, JCKY2018203A001 and JCKY2017203B071.
Author information
Contributions
PT was involved in methodology, software, validation and writing the original draft. YG was involved in supervision and funding acquisition. HL was involved in methodology, formal analysis and investigation. ZW was involved in data curation and formal analysis. GZ was involved in project administration and investigation. JP was involved in project administration and reviewing the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Tang, P., Guo, Y., Li, H. et al. Image dataset creation and networks improvement method based on CAD model and edge operator for object detection in the manufacturing industry. Machine Vision and Applications 32, 111 (2021). https://doi.org/10.1007/s00138-021-01237-y
DOI: https://doi.org/10.1007/s00138-021-01237-y