Frugal 3D Point Cloud Model Training via Progressive Near Point Filtering and Fused Aggregation

  • Conference paper
Computer Vision – ECCV 2024 (ECCV 2024)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15124)

Abstract

The increasing demand for higher accuracy and the rapid growth of 3D point cloud datasets have led to significantly higher training costs for 3D point cloud models in terms of both computation and memory bandwidth. Despite this, research on reducing this cost is relatively sparse. This paper identifies inefficiencies in operations unique to the 3D point cloud training pipeline: farthest point sampling (FPS) and the forward and backward aggregation passes. To address these inefficiencies, we propose novel training optimizations that reduce the redundant computation and memory accesses these operations incur. First, we introduce Lightweight FPS (L-FPS), which employs progressive near point filtering to eliminate the redundant distance calculations inherent in the original farthest point sampling. Second, we introduce the fused aggregation technique, which utilizes kernel fusion to reduce redundant memory accesses during the forward and backward aggregation passes. We apply these techniques to state-of-the-art PointNet-based models and evaluate their performance on an NVIDIA RTX 3090 GPU. Our experimental results demonstrate a 2.25× training time reduction on average with no accuracy drop.
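For readers unfamiliar with the two operations named in the abstract, the following is a minimal CPU sketch in plain Python. It shows the standard (baseline) farthest point sampling loop and a neighbor max-aggregation written in two ways, one that materializes a gathered intermediate and one that reduces in a single pass. It is not the paper's L-FPS or its fused GPU kernels. The function names, the choice of max as the reduction, and the list-based data layout are all assumptions made for illustration.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def farthest_point_sampling(points: Sequence[Point], num_samples: int,
                            start: int = 0) -> List[int]:
    """Baseline FPS (illustrative): greedily pick the point farthest from
    the set already selected, and return the chosen indices in order."""
    n = len(points)
    if not 0 < num_samples <= n:
        raise ValueError("num_samples must be between 1 and len(points)")

    # min_dist[i] is the distance from point i to its nearest selected point.
    min_dist = [math.inf] * n
    selected = [start]
    for _ in range(num_samples - 1):
        last = points[selected[-1]]
        best_idx, best_dist = -1, -1.0
        # Each iteration computes a distance for all n points, so the total
        # work is O(n * num_samples) distance evaluations.
        for i, p in enumerate(points):
            d = math.dist(p, last)
            if d < min_dist[i]:
                min_dist[i] = d
            if min_dist[i] > best_dist:
                best_idx, best_dist = i, min_dist[i]
        selected.append(best_idx)
    return selected


def aggregate_unfused(features: Sequence[Sequence[float]],
                      neighbors: Sequence[Sequence[int]]) -> List[List[float]]:
    """Gather neighbor features into an intermediate buffer, then reduce.
    Assumption: aggregation is a per-channel max over each neighbor group."""
    gathered = [[features[j] for j in group] for group in neighbors]
    return [[max(channel) for channel in zip(*group)] for group in gathered]


def aggregate_single_pass(features: Sequence[Sequence[float]],
                          neighbors: Sequence[Sequence[int]]) -> List[List[float]]:
    """Same result without building the gathered intermediate: each neighbor
    row is read once and folded directly into a running maximum."""
    out = []
    for group in neighbors:
        acc = list(features[group[0]])
        for j in group[1:]:
            for c, value in enumerate(features[j]):
                if value > acc[c]:
                    acc[c] = value
        out.append(acc)
    return out


if __name__ == "__main__":
    pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
    print(farthest_point_sampling(pts, 3))

    feats = [[1.0, 5.0], [3.0, 2.0], [0.0, 9.0]]
    groups = [[0, 1], [1, 2]]
    print(aggregate_unfused(feats, groups))
    print(aggregate_single_pass(feats, groups))
```

The inner loop of the sampling function is where the baseline spends its distance computations, and the `gathered` list in the first aggregation function stands in for the kind of intermediate that a fused kernel would avoid writing to and re-reading from memory. How the paper actually filters near points or fuses its kernels is described in the full text, not in this sketch.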

Acknowledgements

This work was supported by a research grant from Samsung Advanced Institute of Technology (SAIT) and the National Research Foundation of Korea (NRF) grant funded by the Korea Government (MSIT) (RS-2024-00340008). Additionally, this work was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) under the artificial intelligence semiconductor support program (IITP-2023-RS-2023-00256081) and an IITP grant (No. 2021-0-02068, Artificial Intelligence Innovation Hub), both funded by the Korea Government (MSIT). The source code is available at https://github.com/SNU-ARC/Frugal_PN_Training.git. This work was conducted while Yejin Lee was with Seoul National University.

Author information

Corresponding authors

Correspondence to Jae W. Lee or Hongil Yoon.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 7303 KB)

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Lee, D., Lee, Y., Lee, J.W., Yoon, H. (2025). Frugal 3D Point Cloud Model Training via Progressive Near Point Filtering and Fused Aggregation. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15124. Springer, Cham. https://doi.org/10.1007/978-3-031-72848-8_2

  • DOI: https://doi.org/10.1007/978-3-031-72848-8_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-72847-1

  • Online ISBN: 978-3-031-72848-8

  • eBook Packages: Computer Science, Computer Science (R0)
