Abstract
In this study, three-dimensional (3D) spatial data, two-dimensional (2D) texture information, and automatic marking processes were used for the detection and classification of car parts. The automatic marking processes comprised automatic car part segmentation and classification, car part detection and classification on the basis of 2D texture images, and car part segmentation and car identification on the basis of a 3D point cloud. The 2D image processing system identifies car parts and generates numerous car texture images, which are processed in three stages. In the first stage, automated segmentation technology is used to segment the images; in the second stage, different types of training images with various backgrounds are generated; and in the third stage, the You Only Look Once v3 (YOLOv3) model is used to identify car parts on the basis of the generated training images. The adopted 3D model performs processing in two stages. In the first stage, a 3D triangular mesh is combined with a texture image to identify the car part at each grid point by employing a PointNet model trained on ground truth data. In the second stage, the trained PointNet model detects the parts of a 3D car model on the basis of 3D triangular mesh data. The precision of fine part segmentation from texture images was considerably higher than that of simple part segmentation. In car part detection and classification experiments on texture images, the mean intersection over union (mIoU) and mean average precision both exceeded 70%. In a 3D car model experiment conducted using the ShapeNet dataset, the average mIoU and accuracy were 73.66% and 90.2%, respectively. When the developed PointNet model was trained on the Train-2000 dataset, its mIoU and accuracy were 40.6% and 56.64%, respectively; when it was trained on the Train dataset, its mIoU and accuracy were 40.73% and 61.61%, respectively. Thus, training with the Train dataset yielded a 4.97% higher accuracy than training with the Train-2000 dataset, but only a 0.13% higher mIoU. When 3DNetWeight-2 was used as the initial parameters of the developed model in the final training, the mIoU and accuracy of the model were 44.27% and 61.98%, respectively, which are 3.54% and 0.97% higher than those obtained without transfer learning. In sum, the proposed method can identify different parts of objects, and the findings serve as a reference for the design of simulation systems for autonomous vehicles. The method can also be applied to automated automobile management, medical technology, military training, aerospace technology, and disaster response systems.
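The abstract reports part-segmentation quality as mean intersection over union (mIoU) together with point-wise accuracy. As a reference for how such numbers are typically obtained, the following is a minimal sketch of per-part IoU, mIoU, and accuracy for point-wise part labels; the part names, the label convention, and the handling of parts absent from a shape are illustrative assumptions rather than the paper's actual evaluation code.

```python
import numpy as np

def part_segmentation_metrics(pred, gt, num_parts):
    """Per-part IoU, mean IoU, and point-wise accuracy for one shape.

    pred, gt: 1-D integer arrays of per-point part labels (same length).
    num_parts: number of part classes defined for the shape category.
    """
    pred, gt = np.asarray(pred), np.asarray(gt)
    ious = []
    for c in range(num_parts):
        inter = np.sum((pred == c) & (gt == c))
        union = np.sum((pred == c) | (gt == c))
        # Common convention (used, for example, in public PointNet
        # part-segmentation evaluation code): a part absent from both
        # prediction and ground truth scores IoU = 1.
        ious.append(1.0 if union == 0 else inter / union)
    miou = float(np.mean(ious))
    accuracy = float(np.mean(pred == gt))
    return ious, miou, accuracy

# Toy example with three hypothetical car parts
# (0 = body, 1 = wheel, 2 = window) over six points.
ious, miou, acc = part_segmentation_metrics(
    pred=[0, 0, 1, 1, 2, 0], gt=[0, 0, 1, 2, 2, 0], num_parts=3)
print(f"mIoU = {miou:.2%}, accuracy = {acc:.2%}")  # mIoU = 66.67%, accuracy = 83.33%
```

Dataset-level figures such as the 73.66% mIoU on ShapeNet would then be averages of such per-shape scores over the test set.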
References
3D Models for Professionals. TurboSquid https://www.turbosquid.com/. Accessed 13 Nov 2021
Buy and sell 3D models. 3DEXPORT https://ch.3dexport.com/. Accessed 13 Nov 2021
Search thousands of 3D models. CGTRADER https://www.cgtrader.com/. Accessed 13 Nov 2021
Wu Z, Song S, Khosla A, Yu F, Zhang L, Tang X, Xiao J (2015) 3D ShapeNets: a deep representation for volumetric shapes. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, June 7–12, 2015, pp 1912–1920
Yi L, Kim VG, Ceylan D, Shen IC, Yan M, Su H, Lu C, Huang Q, Sheffer A, Guibas L (2016) A scalable active framework for region annotation in 3d shape collections. ACM Trans Graph 35(6):1–12
Aubry M, Schlickewei U, Cremers D (2011) The wave kernel signature: a quantum mechanical approach to shape analysis. In: 2011 IEEE International Conference on Computer Vision Workshops, Barcelona, Nov 6–13, 2011, pp 1626–1633
Sun J, Ovsjanikov M, Guibas L (2009) A concise and provably informative multi-scale signature based on heat diffusion. Comput Graph Forum 28:1383–1392
Bronstein MM, Kokkinos I (2010) Scale-invariant heat kernel signatures for non-rigid shape recognition. In: 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, CA, June 13–18, 2010, pp 1704–1711
Rusu RB, Blodow N, Marton ZC, Beetz M (2008) Aligning point cloud views using persistent feature histograms. In: 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, Nice, France, September 22 to 26, 2008, pp 3384–3391
Ling H, Jacobs DW (2007) Shape classification using the inner-distance. IEEE Trans Pattern Anal Mach Intell 29(2):286–299
Riegler G, Ulusoy AO, Geiger A (2017) OctNet: learning deep 3D representations at high resolutions. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, July 21–26, 2017, pp 6620–6629
Qi CR, Su H, Mo K, Guibas LJ (2017) PointNet: deep learning on point sets for 3D classification and segmentation. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, July 21–26, 2017, pp 77–85
Qi CR, Yi L, Su H, Guibas LJ (2017) PointNet++: deep hierarchical feature learning on point sets in a metric space. In: 31st Annual Conference on Neural Information Processing Systems, Long Beach, CA, USA, Dec 4–9, 2017, pp 5105–5114
Wang P, Liu Y, Guo YX, Sun CY, Tong X (2017) O-CNN: octree-based convolutional neural networks for 3D shape analysis. ACM Trans Graph 36(4):1–11
Lin H, Averkiou M, Kalogerakis E, Kovacs B, Ranade S, Kim VG, Chaudhuri S, Bala K (2018) Learning material-aware local descriptors for 3D shapes. arXiv:1810.08729
Maturana D, Scherer S (2015) VoxNet: a 3D convolutional neural network for real-time object recognition. In: 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems, Hamburg, Germany, Sep 28–Oct 2, 2015, pp 922–928
Li Y, Pirk S, Su H, Qi CR, Guibas LJ (2016) FPNN: field probing neural networks for 3D data. arXiv:1605.06240
Wang DZ, Posner I (2015) Voting for voting in online point cloud object detection. In: Robotics: Science and Systems Conference, Rome, Italy, July 13–17, 2015
Su H, Maji S, Kalogerakis E, Learned-Miller E (2015) Multi-view convolutional neural networks for 3D shape recognition. arXiv:1505.00880
Qi CR, Su H, Nießner M, Dai A, Yan M, Guibas L (2016) Volumetric and multi-view CNNs for object classification on 3D data. arXiv:1604.03265
Savva M, Yu F, Su H, Kanezaki A, Furuya T, Ohbuchi R, Zhou Z, Yu R, Bai S, Bai X, Aono M, Tatsuma A, Thermos S, Axenopoulos A, Papadopoulos GTh, Daras P, Deng X, Lian Z, Li B, Johan H, Lu Y, Mk S (2016) Large-scale 3D shape retrieval from ShapeNet Core55. In: Proceedings of the Eurographics 2016 Workshop on 3D Object Retrieval, Lisbon, Portugal, May 8, 2016, pp 89–98
Bruna J, Zaremba W, Szlam A, LeCun Y (2014) Spectral networks and locally connected networks on graphs. arXiv:1312.6203
Masci J, Boscaini D, Bronstein M, Vandergheynst P (2018) Geodesic convolutional neural networks on Riemannian manifolds. arXiv:1501.06297
Fang Y, Xie J, Dai G, Wang M, Zhu F, Xu T, Wong E (2015) 3D deep shape descriptor. In: IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, June 7–12, 2015, pp 2319–2327
Guo K, Zou D, Chen X (2015) 3D mesh labeling via deep convolutional neural networks. ACM Trans Graph 35(1):1–12
Zhang Y, Rabbat M (2018) A graph-CNN for 3D point cloud classification. arXiv:1812.01711
Wang Y, Sun Y, Liu Z, Sarma SE, Bronstein MM, Solomon JM (2019) Dynamic graph CNN for learning on point clouds. ACM Trans Graph 38(5):1–12
Wu W, Qi Z, Li F (2019) PointConv: deep convolutional networks on 3D point clouds. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, June 16–17, 2019, pp 9613–9622
Zhao H, Jiang L, Jia J, Torr P, Koltun V (2021) Point transformer. arXiv:2012.09164
Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Diego, CA, USA, June 20–25, 2005
Yang Y, Liu X (1999) A re-examination of text categorization methods. In: 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Berkeley, CA, USA, August 15–19, 1999, pp 42–49
Nie JY, Brisebois M, Ren X (1996) On Chinese text retrieval. In: 19th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Zurich, Switzerland, August 18–22, 1996, pp 225–233
Vapnik VN (1996) Structure of statistical learning theory. In: Computational learning and probabilistic reasoning. Wiley, Hoboken
Mitchell TM (1997) Machine learning. McGraw-Hill, New York
Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M, Berg AC, Li FF (2015) ImageNet large scale visual recognition challenge. Int J Comput Vis 115(3):211–252
Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. In: 26th Conference on Neural Information Processing Systems, Lake Tahoe, Nevada, USA, pp 1106–1114
Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556
Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2015) Going deeper with convolutions. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, June 7–12, 2015, pp 1–9
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, June 27–30, 2016, pp 770–778
Everingham M, Van Gool L, Williams CK, Winn J, Zisserman A (2010) The pascal visual object classes (voc) challenge. Int J Comput Vis 88(2):303–338
Redmon J, Farhadi A (2017) YOLO9000: better, faster, stronger. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, July 21–26, 2017, pp 6517–6525
Ren S, He K, Girshick R, Sun J (2017) Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell 39(6):1137–1149
Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu CY, Berg AC (2016) SSD: single shot multibox detector. In: 14th European Conference on Computer Vision, Amsterdam, The Netherlands, October 11–14, 2016, pp 21–37
Redmon J, Farhadi A (2018) YOLOv3: an incremental improvement. arXiv:1804.02767
Bochkovskiy A, Wang CY, Liao HYM (2020) YOLOv4: optimal speed and accuracy of object detection. arXiv:2004.10934
Jin R, Lin D (2020) Adaptive anchor for fast object detection in aerial image. IEEE Geosci Remote Sens Lett 17:839–843
Otsu N (1979) A threshold selection method from gray-level histograms. IEEE Trans Syst Man Cybern 9(1):62–66
Jaderberg M, Simonyan K, Zisserman A, Kavukcuoglu K (2015) Spatial transformer networks. In: 28th International Conference on Neural Information Processing Systems, Montreal, Quebec, Canada, December 7–12, 2015, pp 2017–2025
Lin TY, Maire M, Belongie S, Bourdev L, Girshick R, Hays J, Perona P, Ramanan D, Zitnick CL, Dollár P (2014) Microsoft COCO: common objects in context. In: 13th European Conference on Computer Vision, Zurich, Switzerland, September 6–12, 2014, pp 740–755
Acknowledgements
This work was supported in part by the Ministry of Science and Technology, Taiwan, under Grant No. MOST 110-2221-E-025-006. This manuscript was edited by Wallace Academic Editing.
Ethics declarations
Conflict of interest
All authors declare that they have no conflict of interest.
Data availability
The four 3D model datasets that support the findings of this study are openly available from TurboSquid at https://www.turbosquid.com/, reference number [1]; 3DEXPORT at https://ch.3dexport.com/, reference number [2]; CGTrader at https://www.cgtrader.com/, reference number [3]; and ShapeNet at https://paperswithcode.com/dataset/shapenet, reference number [5]. The car model texture images that support the findings of this study are openly available from TurboSquid [1], 3DEXPORT [2], and CGTrader [3]. The initial training parameters of YOLOv3 were the weight parameters of YOLOv3 pretrained on the MS COCO dataset at https://cocodataset.org/#home, reference number [49].
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Lin, CH., Yu, CC. & Chen, HY. Augmentation dataset of a two-dimensional neural network model for use in the car parts segmentation and car classification of three dimensions. J Supercomput 78, 18915–18958 (2022). https://doi.org/10.1007/s11227-022-04630-0