Abstract
Processing large amounts of high-resolution 3D data requires enormous computational resources. A suitable 3D data representation must therefore be chosen, and the data must be simplified to a size that can be processed easily. The question is how the data should be simplified. Random point sampling is a common strategy, but it is sensitive to changes in density. We build a sampling module based on a hybrid model that combines point cloud and voxel data. To determine the relationship between the points within each voxel, the module uses the magnitude of each point (the Euclidean distance between the point and the object’s center) along with the angles between the points embedded within that voxel. By exploiting farthest point sampling (FPS), which begins with a point in the set and iteratively selects the point farthest from those already selected, our method covers the whole point set within a given number of centroids while still maintaining the key benefits of both point clouds and voxels, which better characterizes the geometric details contained in a 3D shape. Observing further that the number of points differs from cell to cell, we use a point quantization method to ensure that each cell has the same number of points. This gives all voxels a feature vector of the same size, making it easier for 3D convolution kernels to extract object features. We demonstrate these benefits through comparisons with solid baselines on the ModelNet10, ModelNet40 and ShapeNetPart datasets, showing that our method outperforms some deep learning approaches on shape classification and segmentation tasks.
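The two sampling ideas in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the cubic grid, the `grid_size` and `points_per_voxel` parameters, and in particular the padding rule for sparse voxels (cyclic repetition of existing points) are assumptions made for illustration only. The magnitude and angle features are not included.

```python
import numpy as np


def farthest_point_sampling(points, n_samples, start_index=0):
    """Return indices of n_samples points chosen by greedy farthest point sampling."""
    points = np.asarray(points, dtype=float)
    n_points = points.shape[0]
    if not 0 < n_samples <= n_points:
        raise ValueError("n_samples must be in [1, number of points]")

    selected = np.empty(n_samples, dtype=int)
    selected[0] = start_index
    # Distance from every point to its nearest already-selected point.
    min_dist = np.linalg.norm(points - points[start_index], axis=1)

    for i in range(1, n_samples):
        # The next centroid is the point farthest from the selected set.
        selected[i] = int(np.argmax(min_dist))
        new_dist = np.linalg.norm(points - points[selected[i]], axis=1)
        min_dist = np.minimum(min_dist, new_dist)
    return selected


def quantize_voxels(points, grid_size, points_per_voxel):
    """Group points into a cubic grid; every occupied voxel gets exactly points_per_voxel points."""
    points = np.asarray(points, dtype=float)
    lo = points.min(axis=0)
    extent = max(float((points.max(axis=0) - lo).max()), 1e-12)
    # Integer voxel coordinates in [0, grid_size - 1] along each axis.
    cells = np.minimum(((points - lo) / extent * grid_size).astype(int), grid_size - 1)

    voxels = {}
    for key in {tuple(c) for c in cells}:
        members = points[np.all(cells == key, axis=1)]
        if len(members) >= points_per_voxel:
            # Dense voxel: keep a well-spread subset via FPS.
            keep = farthest_point_sampling(members, points_per_voxel)
            voxels[key] = members[keep]
        else:
            # Sparse voxel (assumed rule): repeat existing points cyclically.
            pad = np.resize(np.arange(len(members)), points_per_voxel)
            voxels[key] = members[pad]
    return voxels


# Usage on a synthetic cloud (random data, for illustration only).
rng = np.random.default_rng(0)
cloud = rng.normal(size=(2048, 3))
centroid_idx = farthest_point_sampling(cloud, 64)
voxels = quantize_voxels(cloud, grid_size=4, points_per_voxel=8)
```

Every entry of `voxels` has shape `(8, 3)`, so the occupied cells can be stacked into a dense tensor of uniform feature size, which is the property the abstract relies on for 3D convolution.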
Acknowledgements
This work was supported by the National Natural Science Foundation of China (61671397). We thank all anonymous reviewers for their constructive comments.
Cite this article
Gezawa, A.S., Bello, Z.A., Wang, Q. et al. A voxelized point clouds representation for object classification and segmentation on 3D data. J Supercomput 78, 1479–1500 (2022). https://doi.org/10.1007/s11227-021-03899-x
DOI: https://doi.org/10.1007/s11227-021-03899-x