Point-Voxel Based Geometry-Adaptive Network for 3D Point Cloud Analysis

  • Regular Paper
Journal of Computer Science and Technology

Abstract

Point cloud analysis is challenging because point clouds are unordered and irregular. To describe the geometric information in point clouds, existing methods mainly use convolution, graph, and attention operations to construct sophisticated local aggregation operators. These operators extract local information well but incur high inference latency because of their computational complexity. To address this problem, this paper presents a novel point-voxel based geometry-adaptive network (PVGANet), which combines point and voxel representations to describe a point cloud at different granularities and to obtain features at different scales effectively. To extract fine-grained geometric features, we design a position-adaptive pooling operator, which uses the relative positions and feature similarities of point pairs to weight and aggregate point features in local regions of the point cloud. To extract coarse-grained local features, we design a depth-wise convolution operator, which applies depth-wise convolution to voxel grids. The fine-grained geometric features and the coarse-grained local features are then fused by a simple addition, and the resulting geometry-adaptive features are used for efficient shape analysis of point clouds, such as shape classification and part segmentation. Extensive experiments on the ModelNet40, ScanObjectNN, and ShapeNet Part benchmarks demonstrate that PVGANet achieves competitive performance compared with related methods.
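
The abstract names the two branches and their fusion but does not give their formulas. The pure-Python sketch below is therefore only one plausible reading, not the authors' implementation. Its assumptions are: the pooling weight is a softmax of feature dot product minus Euclidean distance, points lie in the unit cube, voxel features are per-voxel averages, devoxelization is a nearest-voxel lookup, and the helper names (`position_adaptive_pool`, `voxelize`, `depthwise_conv3d`, `center_kernel`, `fuse`) are hypothetical.

```python
import math

def position_adaptive_pool(points, feats, k=3):
    """Weighted pooling over each point's k nearest neighbours.

    ASSUMPTION: the weight is a softmax of (feature dot product minus
    Euclidean distance); the paper's actual formula is not given here.
    """
    pooled = []
    for p, f in zip(points, feats):
        order = sorted(range(len(points)), key=lambda j: math.dist(p, points[j]))
        nbrs = order[:k]  # includes the point itself (distance 0)
        scores = [sum(a * b for a, b in zip(f, feats[j])) - math.dist(p, points[j])
                  for j in nbrs]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]  # numerically stable softmax
        total = sum(exps)
        pooled.append([sum(e / total * feats[j][c] for e, j in zip(exps, nbrs))
                       for c in range(len(f))])
    return pooled

def voxel_index(p, res):
    # ASSUMPTION: coordinates are already normalised to the unit cube [0, 1]^3.
    return tuple(min(int(x * res), res - 1) for x in p)

def voxelize(points, feats, res):
    """Average the features of all points that fall into the same voxel."""
    sums, counts = {}, {}
    for p, f in zip(points, feats):
        v = voxel_index(p, res)
        acc = sums.setdefault(v, [0.0] * len(f))
        for c, x in enumerate(f):
            acc[c] += x
        counts[v] = counts.get(v, 0) + 1
    return {v: [x / counts[v] for x in acc] for v, acc in sums.items()}

def depthwise_conv3d(grid, kernels):
    """Depth-wise 3x3x3 convolution: channel c is filtered only by kernels[c],
    so channels never mix. Empty voxels contribute zero, and outputs are
    computed only at occupied voxels."""
    offsets = (-1, 0, 1)
    out = {}
    for (x, y, z), f in grid.items():
        vals = []
        for c in range(len(f)):
            acc = 0.0
            for dx in offsets:
                for dy in offsets:
                    for dz in offsets:
                        nb = grid.get((x + dx, y + dy, z + dz))
                        if nb is not None:
                            acc += kernels[c][dx + 1][dy + 1][dz + 1] * nb[c]
            vals.append(acc)
        out[(x, y, z)] = vals
    return out

def center_kernel(value):
    """Hypothetical helper: 3x3x3 kernel with `value` at the centre, 0 elsewhere."""
    k = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
    k[1][1][1] = value
    return k

def fuse(points, feats, kernels, res=4, k=3):
    """Run both branches and fuse them by element-wise addition."""
    fine = position_adaptive_pool(points, feats, k)
    coarse = depthwise_conv3d(voxelize(points, feats, res), kernels)
    # Devoxelize by looking up the voxel that contains each point.
    return [[a + b for a, b in zip(fp, coarse[voxel_index(p, res)])]
            for p, fp in zip(points, fine)]

# Toy input: two nearby points that share a voxel, plus one distant point.
points = [(0.1, 0.1, 0.1), (0.15, 0.1, 0.1), (0.9, 0.9, 0.9)]
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
kernels = [center_kernel(1.0), center_kernel(1.0)]
fused = fuse(points, feats, kernels, res=4, k=2)
print(fused)
```

In this sketch the point branch keeps per-point detail, because each point's weights depend on its own neighbours, while the voxel branch sees only per-voxel averages, so the two nearby points receive the same coarse feature. Because a depth-wise kernel has one filter per channel rather than one per channel pair, its parameter count grows linearly with the number of channels.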

Author information

Corresponding author

Correspondence to Hui Zeng (曾 慧).

Ethics declarations

Conflict of Interest: The authors declare that they have no conflict of interest.

Additional information

Recommended by ChinaMM 2023

The work was supported by the National Natural Science Foundation of China under Grant Nos. 62273034, 61973029, and 62076026, and the Scientific and Technological Innovation Foundation of Foshan under Grant No. BK21BF004.

Tian-Meng Zhao received his B.S. degree from University of Science and Technology Beijing, Beijing, in 2020. He is currently pursuing his master’s degree at the same university. His main research interests include computer vision and point cloud processing.

Hui Zeng received her B.S. and M.S. degrees from Shandong University, Jinan, in 2001 and 2004, respectively, and her Ph.D. degree from the Institute of Automation, Chinese Academy of Sciences, Beijing, in 2007. She is currently a professor at the Beijing Engineering Research Center of Industrial Spectrum Imaging, School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing. Her main research interests include computer vision, pattern recognition, and machine learning.

Bao-Qing Zhang received his Ph.D. degree from University of Science and Technology Beijing, Beijing, in 2014. He is currently a senior engineer at Beijing Institute of Electronic System Engineering, Beijing. His main research interests include information processing.

Hong-Min Liu received her B.S. degree from Xidian University, Xi’an, in 2004, and her Ph.D. degree from the Institute of Electronics, Chinese Academy of Sciences, Beijing, in 2009. She is currently a professor with the School of Intelligence Science and Technology and the Institute of Artificial Intelligence, University of Science and Technology Beijing, Beijing. Her research focuses on image processing, computer vision, and pattern recognition.

Bin Fan received his B.S. degree from Beijing University of Chemical Technology, Beijing, in 2006, and his Ph.D. degree from the National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, in 2011. He is currently a professor with the School of Intelligence Science and Technology and the Institute of Artificial Intelligence, University of Science and Technology Beijing, Beijing. His research focuses on computer vision, pattern recognition, image processing, and multimedia.

About this article

Cite this article

Zhao, TM., Zeng, H., Zhang, BQ. et al. Point-Voxel Based Geometry-Adaptive Network for 3D Point Cloud Analysis. J. Comput. Sci. Technol. 39, 1167–1179 (2024). https://doi.org/10.1007/s11390-024-3521-x

  • DOI: https://doi.org/10.1007/s11390-024-3521-x