Abstract
Point cloud analysis is challenging because of the unordered and irregular structure of point cloud data. To describe geometric information in point clouds, existing methods mainly use convolution, graph, and attention operations to construct sophisticated local aggregation operators. These operators extract local information well but incur unfavorable inference latency due to their high computational complexity. To address this problem, this paper presents a novel point-voxel based geometry-adaptive network (PVGANet), which combines point and voxel representations to describe the point cloud at different granularities and can effectively obtain features at different scales. To extract fine-grained geometric features, we design a position-adaptive pooling operator, which uses the relative positions and feature similarities of point pairs to weight and aggregate point features in local regions of the point cloud. To extract coarse-grained local features, we design a depth-wise convolution operator, which performs depth-wise convolution on voxel grids. The fine-grained geometric features and coarse-grained local features are fused by a simple addition, and the resulting geometry-adaptive features enable efficient shape analysis of point clouds, such as shape classification and part segmentation. Extensive experiments on the ModelNet40, ScanObjectNN, and ShapeNet Part benchmarks demonstrate that PVGANet achieves competitive performance compared with related methods.
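The two operators can be summarized with a minimal PyTorch sketch, shown below. The module names (`PositionAdaptivePooling`, `DepthwiseVoxelConv`), tensor shapes, dot-product similarity, and softmax weighting are illustrative assumptions; the sketch only mirrors the high-level description in the abstract (relative-position and similarity weighted pooling on points, depth-wise 3D convolution on voxels, fusion by addition), not the authors' actual implementation.

```python
import torch
import torch.nn as nn


class PositionAdaptivePooling(nn.Module):
    """Aggregates each point's K nearest neighbors, weighting them by
    relative position and feature similarity (illustrative sketch only)."""

    def __init__(self, channels: int, hidden: int = 32):
        super().__init__()
        # Small MLP mapping 3-D relative positions to per-channel scores.
        self.pos_mlp = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(inplace=True), nn.Linear(hidden, channels)
        )

    def forward(self, xyz, feats, knn_idx):
        # xyz: (B, N, 3) coordinates, feats: (B, N, C) features,
        # knn_idx: (B, N, K) indices of each point's neighbors.
        B, N, C = feats.shape
        K = knn_idx.shape[-1]
        flat_idx = knn_idx.reshape(B, N * K)
        nbr_xyz = torch.gather(xyz, 1, flat_idx.unsqueeze(-1).expand(-1, -1, 3)).view(B, N, K, 3)
        nbr_feat = torch.gather(feats, 1, flat_idx.unsqueeze(-1).expand(-1, -1, C)).view(B, N, K, C)
        rel_pos = nbr_xyz - xyz.unsqueeze(2)                         # point pairs' relative positions
        sim = (nbr_feat * feats.unsqueeze(2)).sum(-1, keepdim=True)  # feature similarity (dot product)
        weights = torch.softmax(self.pos_mlp(rel_pos) + sim, dim=2)  # position- and similarity-aware weights
        return (weights * nbr_feat).sum(dim=2)                       # (B, N, C) aggregated point features


class DepthwiseVoxelConv(nn.Module):
    """Depth-wise 3D convolution over a voxel grid (coarse-grained branch)."""

    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        # groups=channels turns the ordinary 3D convolution into a depth-wise one.
        self.conv = nn.Conv3d(channels, channels, kernel_size,
                              padding=kernel_size // 2, groups=channels)

    def forward(self, voxel_feats):
        # voxel_feats: (B, C, D, H, W) point features scattered onto a voxel grid.
        return self.conv(voxel_feats)
```

In a full network, the voxel-branch output would be devoxelized (interpolated back onto the points) and fused with the pooled point features by element-wise addition before the classification or segmentation head.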
Ethics declarations
Conflict of Interest: The authors declare that they have no conflict of interest.
Additional information
Recommended by ChinaMM 2023
The work was supported by the National Natural Science Foundation of China under Grant Nos. 62273034, 61973029, and 62076026, and the Scientific and Technological Innovation Foundation of Foshan under Grant No. BK21BF004.
Tian-Meng Zhao received his B.S. degree from University of Science and Technology Beijing, Beijing, in 2020. He is currently pursuing his master's degree at the same university. His main research interests include computer vision and point cloud processing.
Hui Zeng received her B.S. and M.S. degrees from Shandong University, Jinan, in 2001 and 2004, respectively, and received her Ph.D. degree from Institute of Automation, Chinese Academy of Sciences, Beijing, in 2007. She is currently a professor at the Beijing Engineering Research Center of Industrial Spectrum Imaging, School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing. Her main research interests include computer vision, pattern recognition, and machine learning.
Bao-Qing Zhang received his Ph.D. degree from University of Science and Technology Beijing, Beijing, in 2014. He is currently a senior engineer at Beijing Institute of Electronic System Engineering, Beijing. His main research interest is information processing.
Hong-Min Liu received her B.S. degree from Xidian University, Xi’an, in 2004, and her Ph.D. degree from the Institute of Electronics, Chinese Academy of Sciences, Beijing, in 2009. She is currently a professor with the School of Intelligence Science and Technology and the Institute of Artificial Intelligence, University of Science and Technology Beijing, Beijing. Her research focuses on image processing, computer vision, and pattern recognition.
Bin Fan received his B.S. degree from Beijing University of Chemical Technology, Beijing, in 2006, and his Ph.D. degree from the National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, in 2011. He is currently a professor with the School of Intelligence Science and Technology and the Institute of Artificial Intelligence, University of Science and Technology Beijing, Beijing. His research focuses on computer vision, pattern recognition, image processing, and multimedia.
Cite this article
Zhao, TM., Zeng, H., Zhang, BQ. et al. Point-Voxel Based Geometry-Adaptive Network for 3D Point Cloud Analysis. J. Comput. Sci. Technol. 39, 1167–1179 (2024). https://doi.org/10.1007/s11390-024-3521-x