
PSS: Point Semantic Saliency for 3D Object Detection

  • Conference paper
Artificial Intelligence (CICAI 2021)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13069)


Abstract

Efficient fusion of LiDAR and camera data for 3D object detection is a challenging task. Although RGB images provide rich texture and semantic features, some of these features are unrelated to the targeted objects and are useless, or even misleading, for the detection task. In this paper, point semantic saliency (PSS) is proposed for precise fusion. In this scheme, a physical receptive field (PRF) constraint is built from the camera projection model to relate 2D and 3D receptive fields. To increase the saliency of pixels belonging to targeted objects, PSS extracts salient point features under the guidance of RGB and semantic segmentation images, which provide 2D supplementary information for 3D detection. Comparison results and ablation studies demonstrate that PSS improves detection performance in both localization and classification. Among current single-stage detectors, our method improves AP by 0.62% on the hard difficulty level and mAP by 0.31%.
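The fusion step the abstract describes — relating each 3D LiDAR point to 2D image evidence through the camera projection model, then attaching per-pixel semantic scores to the point — can be sketched as below. This is a minimal illustration of projection-based point painting under standard pinhole-camera assumptions, not the authors' exact PSS/PRF formulation; the `paint_points` helper, the 3x4 projection matrix `P`, and the toy semantic map are hypothetical.

```python
import numpy as np

def project_points(points, P):
    """Project Nx3 points (camera/LiDAR frame) into the image with a 3x4 projection matrix P."""
    pts_h = np.hstack([points, np.ones((points.shape[0], 1))])  # homogeneous coordinates
    uvw = pts_h @ P.T                                           # (N, 3)
    depth = uvw[:, 2]
    uv = uvw[:, :2] / depth[:, None]                            # perspective divide -> pixel coords
    return uv, depth

def paint_points(points, P, semantic_map, image_shape):
    """Append per-pixel semantic scores to the 3D points that land inside the image.

    semantic_map: (H, W, C) array of per-class scores from a 2D segmentation network.
    Points projecting outside the image (or behind the camera) get zero scores.
    """
    uv, depth = project_points(points, P)
    h, w = image_shape
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)
    valid = (depth > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    scores = np.zeros((points.shape[0], semantic_map.shape[-1]))
    scores[valid] = semantic_map[v[valid], u[valid]]
    return np.hstack([points, scores]), valid
```

With a KITTI-style calibration, `P` would be the rectified camera projection matrix applied after transforming points into the camera frame; the painted features can then feed a point-based 3D detector alongside the raw coordinates.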

This work is supported by the National Natural Science Foundation of China (61991412).



Author information

Correspondence to Jie Ma.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Cen, J., An, P., Chen, G., Liang, J., Ma, J. (2021). PSS: Point Semantic Saliency for 3D Object Detection. In: Fang, L., Chen, Y., Zhai, G., Wang, J., Wang, R., Dong, W. (eds) Artificial Intelligence. CICAI 2021. Lecture Notes in Computer Science, vol 13069. Springer, Cham. https://doi.org/10.1007/978-3-030-93046-2_35


  • DOI: https://doi.org/10.1007/978-3-030-93046-2_35

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-93045-5

  • Online ISBN: 978-3-030-93046-2

  • eBook Packages: Computer Science; Computer Science (R0)
