Abstract
Efficient fusion of LiDAR and camera data for 3D object detection is a challenging task. Although RGB images provide rich texture and semantic features, some of these features are unrelated to the targeted objects and are useless, or even misleading, for the detection task. In this paper, point semantic saliency (PSS) is proposed for precise fusion. In this scheme, a physical receptive field (PRF) constraint is built to establish the relation between 2D and 3D receptive fields from the camera projection model. To increase the saliency of pixels belonging to targeted objects, we propose PSS to extract salient point features under the guidance of RGB and semantic segmentation images, which provide 2D supplementary information for 3D detection. Comparison results and ablation studies demonstrate that PSS improves detection performance in both localization and classification. Among current single-stage detectors, our method improves AP by \(0.62\%\) at the hard difficulty level and mAP by \(0.31\%\).
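The PRF constraint described above rests on the standard camera projection model, which maps a 3D LiDAR point in the camera frame to a 2D pixel. As a minimal illustrative sketch (not the authors' implementation; the intrinsic values below are hypothetical, KITTI-like numbers), the pinhole projection can be written as:

```python
def project_point(point, fx, fy, cx, cy):
    """Project a 3D point (X, Y, Z) in the camera frame onto the
    image plane using the pinhole model: u = fx*X/Z + cx, v = fy*Y/Z + cy."""
    x, y, z = point
    if z <= 0:
        return None  # point behind the camera has no valid projection
    return (fx * x / z + cx, fy * y / z + cy)

# Example: a point 10 m in front of the camera and 1 m to the right,
# with illustrative intrinsics (focal lengths fx, fy and principal point cx, cy).
uv = project_point((1.0, 0.0, 10.0), fx=721.5, fy=721.5, cx=609.6, cy=172.9)
```

A mapping like this lets each 3D point (and hence each 3D receptive field) be associated with the 2D image region it projects into, which is the kind of 2D-3D correspondence the PRF constraint formalizes.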
This work is supported by the National Natural Science Foundation of China (61991412).
© 2021 Springer Nature Switzerland AG
Cite this paper
Cen, J., An, P., Chen, G., Liang, J., Ma, J. (2021). PSS: Point Semantic Saliency for 3D Object Detection. In: Fang, L., Chen, Y., Zhai, G., Wang, J., Wang, R., Dong, W. (eds) Artificial Intelligence. CICAI 2021. Lecture Notes in Computer Science, vol 13069. Springer, Cham. https://doi.org/10.1007/978-3-030-93046-2_35
DOI: https://doi.org/10.1007/978-3-030-93046-2_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-93045-5
Online ISBN: 978-3-030-93046-2