3D object detection based on point cloud in automatic driving scene

Li, Hai-Sheng; Lu, Yan-Ling

doi:10.1007/s11042-023-15963-0

Published: 03 July 2023

Volume 83, pages 13029–13044, (2024)

Multimedia Tools and Applications

Abstract

In real-time applications such as autonomous driving and robotics, 3D object detectors in the PointPillars family represent a point cloud as vertical columns (pillars), which makes fast and reliable detection practical. These detectors still have shortcomings: they perform poorly on small or distant objects, and they produce false and missed detections. To address these problems, we add a three-branch extended convolutional network to the detector, which reduces the original network's insensitivity to targets of different sizes, especially small ones. We also add an improved hybrid attention network to reduce missed and false detections of distant vehicles. Experiments on the KITTI dataset show that our network improves on PointPillars, most notably in the mean Average Precision (mAP) for vehicle, pedestrian, and rider detection, while running at roughly the same detection speed.
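
The pillar representation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `group_into_pillars`, the cell size, and the per-pillar point cap are hypothetical choices made for illustration only. The sketch shows just the grouping step, in which each point is assigned to a vertical column according to its x-y grid cell.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]   # (x, y, z) coordinates
PillarKey = Tuple[int, int]          # integer (column, row) index on the x-y grid


def group_into_pillars(
    points: List[Point],
    cell_size: float = 0.16,          # assumed pillar footprint, not taken from the paper
    max_points_per_pillar: int = 32,  # assumed cap, not taken from the paper
) -> Dict[PillarKey, List[Point]]:
    """Group points into vertical columns (pillars) by their x-y grid cell.

    The z coordinate is ignored when choosing the cell, so every pillar
    spans the full height of the scene.
    """
    pillars: Dict[PillarKey, List[Point]] = defaultdict(list)
    for x, y, z in points:
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        # Drop extra points once a pillar is full so that every pillar
        # has a bounded size.
        if len(pillars[key]) < max_points_per_pillar:
            pillars[key].append((x, y, z))
    return dict(pillars)


if __name__ == "__main__":
    cloud = [(0.2, 0.3, 1.0), (0.9, 0.1, 0.5), (1.5, 0.2, 0.0), (-0.5, 0.0, 0.0)]
    for key, members in sorted(group_into_pillars(cloud, cell_size=1.0).items()):
        print(key, len(members))
```

In a full detector, each non-empty pillar would then be encoded into a feature vector and scattered back onto a 2D grid for a convolutional backbone; that part is omitted here.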

Data availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Acknowledgements

This work is supported by the Science and Technology Project of Guangxi under Grant No. 2020GXNSFDA238023 and the National Natural Science Foundation of China under Grant No. 61762012.

Author information

Authors and Affiliations

College of Electronic Engineering, Guangxi Normal University, Guilin, 541004, Guangxi, China
Hai-Sheng Li & Yan-Ling Lu

Corresponding author

Correspondence to Hai-Sheng Li.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, HS., Lu, YL. 3D object detection based on point cloud in automatic driving scene. Multimed Tools Appl 83, 13029–13044 (2024). https://doi.org/10.1007/s11042-023-15963-0

Received: 03 October 2022
Revised: 11 March 2023
Accepted: 22 May 2023
Published: 03 July 2023
Issue Date: February 2024
DOI: https://doi.org/10.1007/s11042-023-15963-0
