PointCSE: Context-sensitive encoders for efficient 3D object detection from point cloud

Wu, Kuoliang; Xu, Guodong; Liu, Zili; Liu, Haifeng; Cai, Deng; He, Xiaofei

doi:10.1007/s13042-021-01342-4

Original Article
Published: 29 October 2021

Volume 13, pages 39–47 (2022)
International Journal of Machine Learning and Cybernetics

Kuoliang Wu¹ (ORCID: orcid.org/0000-0003-4201-7567), Guodong Xu¹, Zili Liu¹, Haifeng Liu¹, Deng Cai¹ & Xiaofei He¹

Abstract

Few modern 3D object detectors achieve fast inference and high accuracy at the same time. To reach high accuracy, they usually either operate directly on raw point clouds or convert the point clouds to a 3D representation and then apply 3D convolution. However, these methods incur sizable computation overhead and rely on complex operations. High-speed 3D detectors based on 2D representations, in turn, remain limited in accuracy. In this paper, we investigate how to leverage context knowledge to strengthen the 2D representation of point clouds for computation- and memory-efficient 3D object detection with state-of-the-art performance. The proposed encoder has two parts: a context-sensitive point sampling network and a point set learning network. Specifically, the point sampling network samples points that carry dense localization information. With high-quality sampled points, we can use a deeper point set learning network to aggregate semantic details at low cost. The proposed encoder is lightweight and well suited to hardware acceleration frameworks such as TensorRT and TVM. Extensive experiments on the KITTI benchmark show that the proposed encoder, called PointCSE, outperforms prior real-time encoders by a large margin with a 1.5× memory reduction; it also achieves state-of-the-art performance at an inference speed of 49 FPS (a 4× speedup on average compared to the previous best methods).
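As a rough illustration of the two-part structure described in the abstract (sample points, then encode each point set into a 2D grid cell), the following minimal sketch may help. It is not the authors' implementation: the abstract gives no architectural details, so the uniform random sampler stands in for the learned context-sensitive sampling network, the single shared linear layer with max pooling stands in for the point set learning network, and all function names, the grid cell size, and the weights are hypothetical.

```python
import random
from collections import defaultdict


def pillarize(points, cell=1.0):
    """Group (x, y, z) points by the bird's-eye-view grid cell they fall in."""
    pillars = defaultdict(list)
    for x, y, z in points:
        key = (int(x // cell), int(y // cell))
        pillars[key].append((x, y, z))
    return dict(pillars)


def sample_points(pts, k, rng):
    """Placeholder sampler (assumption): keep at most k points, chosen uniformly.

    The paper's sampling network is learned and context-sensitive; this
    stand-in only shows where sampling sits in the pipeline.
    """
    if len(pts) <= k:
        return list(pts)
    return rng.sample(pts, k)


def encode_pillar(pts, weights):
    """PointNet-style set encoder (assumption): shared linear layer + ReLU
    applied to every point, followed by a channel-wise max over the set."""
    feats = [
        [max(0.0, sum(w * v for w, v in zip(row, p))) for row in weights]
        for p in pts
    ]
    return [max(channel) for channel in zip(*feats)]


def encode_cloud(points, weights, cell=1.0, k=4, seed=0):
    """Map a point cloud to a sparse 2D grid of per-cell feature vectors."""
    rng = random.Random(seed)
    return {
        key: encode_pillar(sample_points(pts, k, rng), weights)
        for key, pts in pillarize(points, cell).items()
    }


# Toy input: three points, two output channels (hypothetical weights).
cloud = [(0.2, 0.3, 1.0), (0.8, 0.1, 2.0), (1.5, 0.5, 0.5)]
weights = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
grid = encode_cloud(cloud, weights)
```

In this toy setup `grid` holds one feature vector per occupied cell; a detector built on a 2D representation would then run ordinary 2D convolutions over such a grid.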

Author information

Kuoliang Wu and Guodong Xu have contributed equally to this work.

Authors and Affiliations

State Key Lab of CAD&CG, Zhejiang University, No.388 Yu Hang Tang Road, Hangzhou, 310058, China
Kuoliang Wu, Guodong Xu, Zili Liu, Haifeng Liu, Deng Cai & Xiaofei He

Corresponding author

Correspondence to Kuoliang Wu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was supported in part by the National Key Research and Development Program of China (Grant No. 2018AAA0101400) and in part by the National Natural Science Foundation of China (Grant Nos. 62036009, U1909203, 61936006, 61973271).

About this article

Cite this article

Wu, K., Xu, G., Liu, Z. et al. PointCSE: Context-sensitive encoders for efficient 3D object detection from point cloud. Int. J. Mach. Learn. & Cyber. 13, 39–47 (2022). https://doi.org/10.1007/s13042-021-01342-4

Received: 27 December 2020
Accepted: 28 April 2021
Published: 29 October 2021
Issue Date: January 2022
DOI: https://doi.org/10.1007/s13042-021-01342-4
