OKGR: Occluded Keypoint Generation and Refinement for 3D Object Detection

Ji, Mingqian; Yang, Jian; Zhang, Shanshan

doi:10.1007/978-981-99-8555-5_1

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14436)

Included in the following conference series:

Chinese Conference on Pattern Recognition and Computer Vision (PRCV)

Abstract

Lidar-based 3D object detectors use point clouds to detect objects in autonomous driving. However, point clouds are sparse and incomplete, which hinders the detectors from learning shape knowledge and limits 3D detection performance. Previous works improve performance by completing object shape at the point level or at the representation level (for example, voxels). The former increases the computational burden, while the latter generalizes poorly to point-based detectors. In this paper, we present Occluded Keypoint Generation and Refinement (OKGR), an approach that improves 3D detection performance by completing object features at the keypoint level. Specifically, Occluded Keypoint Generation (OKG) generates occluded keypoints to densify the raw keypoints and learns the offsets between the generated keypoints and prototypes, while leaving the raw keypoints unchanged. Occluded Keypoint Refinement (OKR) assigns weights to the generated keypoints and applies these weights to the features to obtain high-quality complete features for detection. We apply our approach to two representative detectors, PV-RCNN++ and PDV, and evaluate them on KITTI and the Waymo Open Dataset. The experiments show significant performance improvement. In particular, OKGR applied to PV-RCNN++ improves Pedestrian and Cyclist by +3.19% and +2.53% AP, averaged over difficulty levels, on KITTI, and by +2.18% and +2.29% mAPH on the Waymo Open Dataset. The supplementary material and code are available at https://github.com/Mingqj/OKGR.
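
The data flow described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repository linked above for that); it only shows one plausible reading of "shift generated keypoints by learned offsets, weight their features, keep raw keypoints unchanged". The function name, argument names, array shapes, and the use of a sigmoid to turn scores into weights are all assumptions made for illustration, and the offsets and scores stand in for quantities a network would predict.

```python
import numpy as np


def complete_keypoints(raw_xyz, raw_feat, gen_xyz, gen_feat, offsets, scores):
    """Toy keypoint-level completion (hypothetical, not the paper's code).

    raw_xyz (N, 3), raw_feat (N, C): observed keypoints, passed through unchanged.
    gen_xyz (M, 3), gen_feat (M, C): generated (occluded) keypoints and features.
    offsets (M, 3): assumed per-keypoint displacement, e.g. toward a prototype.
    scores  (M,):   assumed unnormalized quality score per generated keypoint.
    """
    # Shift each generated keypoint by its offset.
    moved_xyz = gen_xyz + offsets

    # Assumption: a sigmoid maps scores to weights in (0, 1).
    weights = 1.0 / (1.0 + np.exp(-scores))

    # Apply one weight per generated keypoint across all feature channels.
    weighted_feat = gen_feat * weights[:, None]

    # Raw keypoints come first and are not modified.
    all_xyz = np.concatenate([raw_xyz, moved_xyz], axis=0)
    all_feat = np.concatenate([raw_feat, weighted_feat], axis=0)
    return all_xyz, all_feat, weights
```

In this sketch a generated keypoint with a low score contributes a feature vector close to zero, so unreliable generated keypoints have little influence on whatever consumes the merged features, while the first N rows of both outputs are exactly the raw inputs.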

Acknowledgement

This work is supported by the National Natural Science Foundation of China (Grant No. 62322602 and Grant No. 62172225) and the CAAI-Huawei MindSpore Open Fund.

Author information

Authors and Affiliations

PCA Lab, Key Lab of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, and Jiangsu Key Lab of Image and Video Understanding for Social Security, School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China
Mingqian Ji, Jian Yang & Shanshan Zhang

Corresponding author

Correspondence to Shanshan Zhang.

Editor information

Editors and Affiliations

Nanjing University of Information Science and Technology, Nanjing, China
Qingshan Liu
Xiamen University, Xiamen, China
Hanzi Wang
Beijing University of Posts and Telecommunications, Beijing, China
Zhanyu Ma
Sun Yat-sen University, Guangzhou, China
Weishi Zheng
Peking University, Beijing, China
Hongbin Zha
Chinese Academy of Sciences, Beijing, China
Xilin Chen
Chinese Academy of Sciences, Beijing, China
Liang Wang
Xiamen University, Xiamen, China
Rongrong Ji

About this paper

Cite this paper

Ji, M., Yang, J., Zhang, S. (2024). OKGR: Occluded Keypoint Generation and Refinement for 3D Object Detection. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14436. Springer, Singapore. https://doi.org/10.1007/978-981-99-8555-5_1

DOI: https://doi.org/10.1007/978-981-99-8555-5_1
Published: 28 December 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8554-8
Online ISBN: 978-981-99-8555-5
eBook Packages: Computer Science, Computer Science (R0)
