Augmentation dataset of a two-dimensional neural network model for use in the car parts segmentation and car classification of three dimensions

Published in The Journal of Supercomputing

Abstract

In this study, three-dimensional (3D) spatial data, two-dimensional (2D) texture information, and automatic marking processes were used for the detection and classification of car parts. The automatic marking processes involved automatic car part segmentation and classification, car part detection and classification on the basis of 2D texture images, and car part segmentation and car identification on the basis of a 3D point cloud. The 2D image processing system identifies car parts and generates numerous car texture images, which are subjected to three stages of processing. In the first stage, automated segmentation technology is used to segment images; in the second stage, training images with various backgrounds are generated; and in the third stage, the You Only Look Once v3 (YOLOv3) model is used to identify car parts on the basis of the generated training images. The adopted 3D model performs processing in two stages. In the first stage, a 3D triangular mesh is combined with a texture image to identify car parts at the mesh points by employing a PointNet model trained on ground truth data. In the second stage, the trained PointNet model is used to detect the parts of a 3D car model on the basis of 3D triangular mesh data. The precision of fine part segmentation from texture images was considerably higher than that of simple part segmentation. In car part detection and classification experiments on texture images, the mean intersection over union (mIoU) and mean average precision both exceeded 70%. In a 3D car model experiment conducted using the ShapeNet dataset, the average mIoU and accuracy were 73.66% and 90.2%, respectively. When the developed PointNet model was trained on the Train-2000 dataset, its mIoU and accuracy were 40.6% and 56.64%, respectively; when it was trained on the Train dataset, they were 40.73% and 61.61%, respectively. Thus, the model achieved 4.97% higher accuracy, but only 0.13% higher mIoU, when trained on the Train dataset than when trained on the Train-2000 dataset. When 3DNetWeight-2 was used as the initial parameters of the developed model in the final training, the mIoU and accuracy of the model were 44.27% and 61.98%, respectively; these values were 3.54% and 0.97% higher than those obtained without transfer learning. In sum, the proposed method can identify different parts of objects, and the findings serve as a reference for the design of simulation systems for autonomous vehicles. The method can also be applied to automated automobile management, medical technology, military training, aerospace technology, and disaster response systems.
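
For reference, the mIoU and accuracy figures quoted above follow the standard point-level part segmentation definitions: per-part intersection over union averaged across part classes, and the fraction of correctly labeled points. Below is a minimal sketch of these metrics, not the authors' evaluation code; the label arrays are hypothetical.

```python
import numpy as np

def part_segmentation_metrics(pred, gt, num_parts):
    """Point-level part segmentation metrics.

    pred, gt : integer arrays of shape (N,), one part label per point.
    Returns (mIoU, accuracy), both in [0, 1].
    """
    ious = []
    for part in range(num_parts):
        inter = np.sum((pred == part) & (gt == part))
        union = np.sum((pred == part) | (gt == part))
        # A part absent from both prediction and ground truth is
        # conventionally scored as IoU = 1 (some protocols skip it).
        ious.append(1.0 if union == 0 else inter / union)
    return float(np.mean(ious)), float(np.mean(pred == gt))

# Hypothetical labels for a six-point cloud with three part classes
pred = np.array([0, 0, 1, 2, 2, 1])
gt = np.array([0, 1, 1, 2, 2, 2])
miou, acc = part_segmentation_metrics(pred, gt, num_parts=3)
print(f"mIoU = {miou:.2%}, accuracy = {acc:.2%}")  # mIoU = 50.00%, accuracy = 66.67%
```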

References

  1. 3D Models for Professionals. TurboSquid https://www.turbosquid.com/. Accessed 13 Nov 2021

  2. Buy and sell 3D models. 3DEXPORT https://ch.3dexport.com/. Accessed 13 Nov 2021

  3. Search thousands of 3D models. CGTRADER https://www.cgtrader.com/. Accessed 13 Nov 2021

  4. Wu Z, Song S, Khosla A, Yu F, Zhang L, Tang X, Xiao J (2015) 3d shapenets: a deep representation for volumetric shapes. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, June 7–12, 2015, pp 1912–1920

  5. Yi L, Kim VG, Ceylan D, Shen IC, Yan M, Su H, Lu C, Huang Q, Sheffer A, Guibas L (2016) A scalable active framework for region annotation in 3d shape collections. ACM Trans Graph 35(6):1–12

  6. Aubry M, Schlickewei U, Cremers D (2011) The wave kernel signature: a quantum mechanical approach to shape analysis. In: 2011 IEEE International Conference on Computer Vision Workshops, Barcelona, Nov 6–13, 2011, pp 1626–1633

  7. Sun J, Ovsjanikov M, Guibas L (2009) A concise and provably informative multi-scale signature based on heat diffusion. Comput Graph Forum 28(5):1383–1392

  8. Bronstein MM, Kokkinos I (2010) Scale-invariant heat kernel signatures for non-rigid shape recognition. In: 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, CA, June 13–18, 2010, pp 1704–1711

  9. Rusu RB, Blodow N, Marton ZC, Beetz M (2008) Aligning point cloud views using persistent feature histograms. In: 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, Nice, France, September 22–26, 2008, pp 3384–3391

  10. Ling H, Jacobs DW (2007) Shape classification using the inner-distance. IEEE Trans Pattern Anal Mach Intell 29(2):286–299

  11. Riegler G, Ulusoy AO, Geiger A (2017) OctNet: learning deep 3d representations at high resolutions. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, July 21–26, 2017, pp 6620–6629

  12. Qi CR, Su H, Mo K, Guibas LJ (2017) PointNet: deep learning on point sets for 3d classification and segmentation. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, July 21–26, 2017, pp 77–85

  13. Qi CR, Yi L, Su H, Guibas LJ (2017) PointNet++: deep hierarchical feature learning on point sets in a metric space. In: 31st Annual Conference on Neural Information Processing Systems, Long Beach, CA, USA, Dec 4–9, 2017, pp 5105–5114

  14. Wang P, Liu Y, Guo YX, Sun CY, Tong X (2017) O-CNN: octree-based convolutional neural networks for 3D shape analysis. ACM Trans Graph 36(4):1–11

  15. Lin H, Averkiou M, Kalogerakis E, Kovacs B, Ranade S, Kim VG, Chaudhuri S, Bala K (2018) Learning material-aware local descriptors for 3D shapes. arXiv:1810.08729

  16. Maturana D, Scherer S (2015) VoxNet: a 3D convolutional neural network for real-time object recognition. In: 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems, Hamburg, Germany, Sep 28–Oct 2, 2015, pp 922–928

  17. Li Y, Pirk S, Su H, Qi CR, Guibas LJ (2016) FPNN: field probing neural networks for 3D data. arXiv:1605.06240

  18. Wang DZ, Posner I (2015) Voting for voting in online point cloud object detection. In: Robotics: Science and Systems Conference, Rome, Italy, July 13–17, 2015

  19. Su H, Maji S, Kalogerakis E, Learned-Miller E (2015) Multi-view convolutional neural networks for 3d shape recognition. arXiv:1505.00880

  20. Qi CR, Su H, Nießner M, Dai A, Yan M, Guibas L (2016) Volumetric and multi-view cnns for object classification on 3d data. arXiv:1604.03265

  21. Savva M, Yu F, Su H, Kanezaki A, Furuya T, Ohbuchi R, Zhou Z, Yu R, Bai S, Bai X, Aono M, Tatsuma A, Thermos S, Axenopoulos A, Papadopoulos GTh, Daras P, Deng X, Lian Z, Li B, Johan H, Lu Y, Mk S (2016) Large-scale 3D shape retrieval from ShapeNet Core55. In: Proceedings of the Eurographics 2016 Workshop on 3D Object Retrieval, Lisbon, Portugal, May 8, 2016, pp 89–98

  22. Bruna J, Zaremba W, Szlam A, LeCun Y (2014) Spectral networks and locally connected networks on graphs. arXiv:1312.6203

  23. Masci J, Boscaini D, Bronstein M, Vandergheynst P (2015) Geodesic convolutional neural networks on Riemannian manifolds. arXiv:1501.06297

  24. Fang Y, Xie J, Dai G, Wang M, Zhu F, Xu T, Wong E (2015) 3d deep shape descriptor. In: IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, June 7–12, 2015, pp 2319–2327

  25. Guo K, Zou D, Chen X (2015) 3D mesh labeling via deep convolutional neural networks. ACM Trans Graph 35(1):1–12

  26. Zhang Y, Rabbat M (2018) A graph-CNN for 3d point cloud classification. arXiv:1812.01711

  27. Wang Y, Sun Y, Liu Z, Sarma SE, Bronstein MM, Solomon JM (2019) Dynamic graph CNN for learning on point clouds. ACM Trans Graph 38(5):1–12

  28. Wu W, Qi Z, Li F (2019) PointConv: deep convolutional networks on 3d point clouds. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, June 16–17, 2019, pp 9613–9622

  29. Zhao H, Jiang L, Jia J, Torr P, Koltun V (2021) Point transformer. arXiv:2012.09164

  30. Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Diego, CA, USA, June 20–25, 2005

  31. Yang Y, Liu X (1999) A re-examination of text categorization methods. In: 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Berkeley, California, USA, August 15–19, 1999, pp 42–49

  32. Nie JY, Brisebois M, Ren X (1996) On Chinese text retrieval. In: 19th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Zurich, Switzerland, August 18–22, 1996, pp 225–233

  33. Vapnik VN (1996) Structure of statistical learning theory. In: Computational learning and probabilistic reasoning. Wiley, Hoboken

  34. Mitchell TM (1997) Machine learning. McGraw-Hill, Burr Ridge, IL

  35. Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M, Berg AC, Li FF (2015) ImageNet large scale visual recognition challenge. Int J Comput Vis 115(3):211–252

  36. Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. In: 26th Conference on Neural Information Processing Systems, Lake Tahoe, Nevada, USA, pp 1106–1114

  37. Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556

  38. Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2015) Going deeper with convolutions. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, June 7–12, 2015, pp 1–9

  39. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, June 27–30, 2016, pp 770–778

  40. Everingham M, Van Gool L, Williams CK, Winn J, Zisserman A (2010) The pascal visual object classes (voc) challenge. Int J Comput Vis 88(2):303–338

  41. Redmon J, Farhadi A (2017) YOLO9000: better, faster, stronger. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, July 21–26, 2017, pp 6517–6525

  42. Ren S, He K, Girshick R, Sun J (2017) Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell 39(6):1137–1149

  43. Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu CY, Berg AC (2016) SSD: single shot multibox detector. In: 14th European Conference on Computer Vision, Amsterdam, The Netherlands, October 11–14, 2016, pp 21–37

  44. Redmon J, Farhadi A (2018) YOLOv3: an incremental improvement. arXiv:1804.02767

  45. Bochkovskiy A, Wang CY, Liao HYM (2020) YOLOv4: optimal speed and accuracy of object detection. arXiv:2004.10934

  46. Jin R, Lin D (2020) Adaptive anchor for fast object detection in aerial image. IEEE Geosci Remote Sens Lett 17:839–843

  47. Otsu N (1979) A threshold selection method from gray-level histograms. IEEE Trans Syst Man Cybern 9(1):62–66

  48. Jaderberg M, Simonyan K, Zisserman A, Kavukcuoglu K (2015) Spatial transformer networks. In: 28th International Conference on Neural Information Processing Systems, Montreal, Quebec, Canada, December 7–12, 2015, pp 2017–2025

  49. Lin TY, Maire M, Belongie S, Bourdev L, Girshick R, Hays J, Perona P, Ramanan D, Zitnick CL, Dollar P (2014) Microsoft COCO: common objects in context. In: 13th European Conference on Computer Vision, Zurich, Switzerland, September 6–12, 2014, pp 740–755

Acknowledgements

This work was supported in part by the Ministry of Science and Technology, Taiwan, under Grant No. MOST 110-2221-E-025-006. This manuscript was edited by Wallace Academic Editing.

Author information

Corresponding author

Correspondence to Chuen-Horng Lin.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Data availability

Four 3D model datasets that support the findings of this study are openly available: TurboSquid at https://www.turbosquid.com/, reference number [1]; 3DEXPORT at https://ch.3dexport.com/, reference number [2]; CGTrader at https://www.cgtrader.com/, reference number [3]; and ShapeNet at https://paperswithcode.com/dataset/shapenet, reference number [5]. The car model texture images that support the findings of this study are openly available from TurboSquid, 3DEXPORT, and CGTrader at the same addresses, reference numbers [1–3]. The initial training parameters of YOLOv3 were its weight parameters pretrained on the MS COCO dataset at https://cocodataset.org/#home, reference number [49].
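
The transfer-learning setups described in this article (COCO-pretrained weights as YOLOv3's initial parameters, and 3DNetWeight-2 as the PointNet model's initial parameters) amount to initializing a network from a checkpoint before fine-tuning. Below is a minimal PyTorch sketch of such shape-matched weight transfer, assuming a PyTorch reimplementation rather than the authors' code; the model and checkpoint path are hypothetical.

```python
import torch

def load_pretrained_weights(model, checkpoint_path):
    """Initialize a model from a pretrained checkpoint, copying only
    the parameters whose names and shapes match the new model.
    Returns the number of tensors transferred.
    """
    pretrained = torch.load(checkpoint_path, map_location="cpu")
    own_state = model.state_dict()
    transferred = {
        name: tensor
        for name, tensor in pretrained.items()
        if name in own_state and own_state[name].shape == tensor.shape
    }
    own_state.update(transferred)
    model.load_state_dict(own_state)
    return len(transferred)

# Hypothetical usage: start car-part training from COCO-pretrained weights
# n = load_pretrained_weights(yolov3_model, "yolov3_coco.pt")
```

Parameters whose names or shapes do not match, such as a detection head sized for a different number of classes, keep their fresh initialization and are learned during fine-tuning.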

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

See Table 20 and Figs. 37 and 38.

Table 20 Number of texture images for the 3D car model

Fig. 37 Type A-1 texture image

Fig. 38 Type A-2 texture image
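
These texture images feed the second processing stage described in the abstract, in which training images with various backgrounds are generated from segmented parts. Below is a minimal sketch of that background-compositing idea, not the authors' implementation; it assumes RGBA part crops whose alpha channel encodes the segmentation mask, and all names and shapes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def composite_on_background(part_rgba, background):
    """Paste a segmented part (RGBA; alpha taken from its segmentation
    mask) at a random position on an RGB background image, returning the
    composite and the ground-truth bounding box of the paste location.
    """
    ph, pw = part_rgba.shape[:2]
    bh, bw = background.shape[:2]
    y = rng.integers(0, bh - ph + 1)
    x = rng.integers(0, bw - pw + 1)
    alpha = part_rgba[..., 3:4] / 255.0  # (ph, pw, 1) blend weights
    out = background.astype(np.float64).copy()
    region = out[y:y + ph, x:x + pw]
    out[y:y + ph, x:x + pw] = alpha * part_rgba[..., :3] + (1 - alpha) * region
    bbox = (x, y, x + pw, y + ph)
    return out.astype(np.uint8), bbox

# Hypothetical 64x64 part crop on a 256x256 random background
part = np.zeros((64, 64, 4), dtype=np.uint8)
part[..., 3] = 255  # fully opaque mask
bg = rng.integers(0, 256, (256, 256, 3), dtype=np.uint8)
img, bbox = composite_on_background(part, bg)
```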


Cite this article

Lin, CH., Yu, CC. & Chen, HY. Augmentation dataset of a two-dimensional neural network model for use in the car parts segmentation and car classification of three dimensions. J Supercomput 78, 18915–18958 (2022). https://doi.org/10.1007/s11227-022-04630-0

