Abstract
Efficient fusion of LiDAR and camera data for 3D object detection is a challenging task. Although RGB images provide rich texture and semantic features, some of these features are unrelated to the targeted objects and are useless, or even misleading, for the detection task. In this paper, point semantic saliency (PSS) is proposed for precise fusion. In this scheme, a physical receptive field (PRF) constraint is built to establish the relation between 2D and 3D receptive fields from the camera projection model. To increase the saliency of pixels belonging to targeted objects, we propose PSS to extract salient point features under the guidance of RGB and semantic segmentation images, which provide 2D supplementary information for 3D detection. Comparison results and ablation studies demonstrate that PSS improves detection performance in both localization and classification. Among current single-stage detectors, our method improves AP by \(0.62\%\) at the hard difficulty level and mAP by \(0.31\%\).
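The PRF constraint described above rests on the standard camera projection model, which maps a 3D LiDAR point in the camera frame to a 2D pixel. As a minimal illustrative sketch (not the authors' implementation; the intrinsic values below are hypothetical, KITTI-like numbers), the pinhole projection can be written as:

```python
def project_point(point, fx, fy, cx, cy):
    """Project a 3D point (X, Y, Z) in the camera frame onto the
    image plane using the pinhole model: u = fx*X/Z + cx, v = fy*Y/Z + cy."""
    x, y, z = point
    if z <= 0:
        return None  # point behind the camera has no valid projection
    return (fx * x / z + cx, fy * y / z + cy)

# Example: a point 10 m in front of the camera and 1 m to the right,
# with illustrative intrinsics (focal lengths fx, fy and principal point cx, cy).
uv = project_point((1.0, 0.0, 10.0), fx=721.5, fy=721.5, cx=609.6, cy=172.9)
```

A mapping like this lets each 3D point (and hence each 3D receptive field) be associated with the 2D image region it projects into, which is the kind of 2D-3D correspondence the PRF constraint formalizes.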
This work is supported by the National Natural Science Foundation of China (61991412).
© 2021 Springer Nature Switzerland AG
Cite this paper
Cen, J., An, P., Chen, G., Liang, J., Ma, J. (2021). PSS: Point Semantic Saliency for 3D Object Detection. In: Fang, L., Chen, Y., Zhai, G., Wang, J., Wang, R., Dong, W. (eds) Artificial Intelligence. CICAI 2021. Lecture Notes in Computer Science, vol 13069. Springer, Cham. https://doi.org/10.1007/978-3-030-93046-2_35
DOI: https://doi.org/10.1007/978-3-030-93046-2_35
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-93045-5
Online ISBN: 978-3-030-93046-2