Abstract
In this paper, we propose GridNet, a novel and effective approach to hierarchically learning deep representations of 3D point clouds. It combines regular holistic description with fast data processing in a single framework, which allows it to abstract powerful features progressively and efficiently. Moreover, to capture internal geometric attributes more accurately, anchors are inferred within local neighborhoods, in contrast to the fixed or sampled ones used in existing methods, so the learned features are more representative of, and more discriminative for, the local point distribution. GridNet delivers very competitive results compared with state-of-the-art methods on both object classification and segmentation tasks.
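The distinction between fixed, sampled, and data-dependent anchors can be illustrated with a minimal sketch. It is not the GridNet implementation: the function names and the `cell_size` parameter are hypothetical, and the per-cell centroid is used only as a simple stand-in for an anchor that depends on the local point distribution, whereas the abstract describes anchors inferred by the network.

```python
import random
from collections import defaultdict


def grid_cells(points, cell_size):
    """Group 3D points by the integer index of the regular grid cell containing them."""
    cells = defaultdict(list)
    for x, y, z in points:
        key = (int(x // cell_size), int(y // cell_size), int(z // cell_size))
        cells[key].append((x, y, z))
    return cells


def fixed_anchor(key, cell_size):
    """Fixed anchor: the geometric center of the cell, independent of its points."""
    return tuple((k + 0.5) * cell_size for k in key)


def sampled_anchor(cell_points, rng):
    """Sampled anchor: one of the input points in the cell, chosen at random."""
    return rng.choice(cell_points)


def centroid_anchor(cell_points):
    """Data-dependent anchor (assumption: centroid as a stand-in for an inferred one)."""
    n = len(cell_points)
    return tuple(sum(p[i] for p in cell_points) / n for i in range(3))


if __name__ == "__main__":
    rng = random.Random(0)
    cloud = [(0.1, 0.1, 0.1), (0.3, 0.1, 0.1), (1.2, 0.5, 0.5)]
    for key, pts in grid_cells(cloud, cell_size=1.0).items():
        print(key, fixed_anchor(key, 1.0), sampled_anchor(pts, rng), centroid_anchor(pts))
```

In this toy setting, the fixed anchor ignores where the points fall inside a cell and the sampled anchor is restricted to existing points, while the centroid moves with the local distribution; a learned anchor would presumably replace the centroid with a network prediction.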
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant No. 61673033).
Author information
Huiqun Wang received the BS and MS degrees in computer science from Beihang University, China in 2015 and 2018, respectively, where he is currently pursuing the PhD degree. His current research interest is 3D computer vision.
Di Huang received the BS and MS degrees in computer science from Beihang University, China, and the PhD degree in computer science from Ecole Centrale de Lyon, France, in 2005, 2008, and 2011, respectively. He then joined the Laboratory of Intelligent Recognition and Image Processing, Beijing Key Laboratory of Digital Media, School of Computer Science and Engineering, Beihang University, as a Faculty Member. He is currently an associate professor, and his research interests include biometrics, 2-D/3-D face analysis, image/video processing, and pattern recognition.
Yunhong Wang received the BS degree in electronic engineering from Northwestern Polytechnical University, China in 1989, and the MS and PhD degrees in electronic engineering from Nanjing University of Science and Technology, China in 1995 and 1998, respectively. She worked at the National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, China from 1998 to 2004. Since 2004, she has been a professor with the School of Computer Science and Engineering, Beihang University, China, where she is also the Director of the Laboratory of Intelligent Recognition and Image Processing, Beijing Key Laboratory of Digital Media. Her research interests include biometrics, pattern recognition, computer vision, data fusion, and image processing. She is a fellow of the IEEE Computer Society.
Cite this article
Wang, H., Huang, D. & Wang, Y. GridNet: efficiently learning deep hierarchical representation for 3D point cloud understanding. Front. Comput. Sci. 16, 161301 (2022). https://doi.org/10.1007/s11704-020-9521-2
DOI: https://doi.org/10.1007/s11704-020-9521-2