GridNet: efficiently learning deep hierarchical representation for 3D point cloud understanding

  • Research Article
  • Published in Frontiers of Computer Science

Abstract

In this paper, we propose GridNet, a novel and effective approach for hierarchically learning deep representations of 3D point clouds. It combines regular holistic description with fast data processing in a single framework, which allows it to abstract powerful features progressively and efficiently. Moreover, to capture internal geometric attributes more accurately, anchors are inferred within local neighborhoods, in contrast to the fixed or sampled anchors used in existing methods, so the learned features are more representative of, and more discriminative for, the local point distribution. GridNet delivers very competitive results compared with state-of-the-art methods in both object classification and segmentation tasks.
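The contrast between fixed anchors and anchors inferred from a local neighborhood can be illustrated with a minimal sketch. This is not the GridNet implementation: the abstract does not say how anchors are inferred, so the per-cell centroid below is only an assumed stand-in for the learned inference, and the names `grid_cells`, `anchors_per_cell` and `resolution` are hypothetical.

```python
import numpy as np

def grid_cells(points, resolution):
    """Assign each point to a cell of a regular grid over the bounding box."""
    lo = points.min(axis=0)
    span = points.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero on degenerate axes
    idx = np.floor((points - lo) / span * resolution).astype(int)
    idx = np.clip(idx, 0, resolution - 1)  # points on the upper boundary
    return idx, lo, span

def anchors_per_cell(points, resolution):
    """For every occupied cell, return (fixed anchor, inferred anchor, offsets).

    fixed anchor    : the geometric center of the cell (independent of the data)
    inferred anchor : ASSUMPTION, the centroid of the points in the cell, used
                      here only as a stand-in for an anchor inferred from the
                      local neighborhood
    offsets         : point coordinates relative to the inferred anchor
    """
    idx, lo, span = grid_cells(points, resolution)
    dims = (resolution,) * 3
    flat = np.ravel_multi_index(tuple(idx.T), dims)
    result = {}
    for cell in np.unique(flat):
        members = points[flat == cell]
        ijk = np.array(np.unravel_index(cell, dims))
        fixed = lo + (ijk + 0.5) / resolution * span
        inferred = members.mean(axis=0)
        result[int(cell)] = (fixed, inferred, members - inferred)
    return result

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cloud = rng.random((1024, 3))
    cells = anchors_per_cell(cloud, resolution=4)
    fixed, inferred, offsets = next(iter(cells.values()))
    print("fixed anchor:", fixed)
    print("inferred anchor:", inferred)
    print("points in cell:", len(offsets))
```

In this sketch the fixed anchor ignores how points are distributed inside a cell, while the data-dependent anchor moves with them, so the offsets describe local geometry relative to where the points actually lie.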

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant No. 61673033).

Author information

Corresponding author

Correspondence to Di Huang.

Additional information

Huiqun Wang received the BS and MS degrees in computer science from Beihang University, China, in 2015 and 2018, respectively, where he is currently pursuing the PhD degree. His current research interest is 3D computer vision.

Di Huang received the BS and MS degrees in computer science from Beihang University, China, and the PhD degree in computer science from Ecole Centrale de Lyon, France, in 2005, 2008, and 2011, respectively. He then joined the Laboratory of Intelligent Recognition and Image Processing, Beijing Key Laboratory of Digital Media, School of Computer Science and Engineering, Beihang University, as a Faculty Member. He is currently an associate professor, and his research interests include biometrics, 2-D/3-D face analysis, image/video processing, and pattern recognition.

Yunhong Wang received the BS degree in electronic engineering from Northwestern Polytechnical University, China, in 1989, and the MS and PhD degrees in electronic engineering from Nanjing University of Science and Technology, China, in 1995 and 1998, respectively. She worked at the National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, China, from 1998 to 2004. Since 2004, she has been a professor with the School of Computer Science and Engineering, Beihang University, China, where she is also the Director of the Laboratory of Intelligent Recognition and Image Processing, Beijing Key Laboratory of Digital Media. Her research interests include biometrics, pattern recognition, computer vision, data fusion, and image processing. She is a fellow of the IEEE Computer Society.

About this article

Cite this article

Wang, H., Huang, D. & Wang, Y. GridNet: efficiently learning deep hierarchical representation for 3D point cloud understanding. Front. Comput. Sci. 16, 161301 (2022). https://doi.org/10.1007/s11704-020-9521-2

  • DOI: https://doi.org/10.1007/s11704-020-9521-2
