BVNet: A 3D End-to-End Model Based on Point Cloud

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12509)

Abstract

Point cloud data from LiDAR are increasingly used to detect road situations for autonomous driving. The most important issues here are detection accuracy and processing time. In this study, we propose a new model that improves detection performance on point clouds. A well-known difficulty in processing 3D point clouds is that the point data are unordered. To address this problem, we define 3D point cloud features in the grid cells of the bird’s view according to the distribution of the points. In particular, we introduce the average and standard deviation of the point heights, as well as a distance-related density of the points, as new features inside each cell. The resulting feature map is fed into a conventional neural network to obtain the outcomes, thus realizing an end-to-end real-time detection framework, called BVNet (Bird’s-View-Net). The proposed model is tested on the KITTI benchmark suite, and the results show considerable improvement in detection accuracy compared with models without the newly introduced features.
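The per-cell encoding described above can be sketched as follows. The grid extents, cell size, and the exact form of the distance-related density normalization are illustrative assumptions, not the formulas from the paper; the sketch only shows how an unordered point cloud can be turned into a fixed-size bird's-view feature map with mean height, height standard deviation, and a distance-weighted density per cell.

```python
import numpy as np

def bev_feature_map(points, x_range=(0.0, 40.0), y_range=(-20.0, 20.0),
                    cell=0.2):
    """Encode an unordered (N, 3) point cloud as a (3, nx, ny) BEV grid.

    Channels: mean height, std of heights, distance-weighted density.
    Ranges, cell size, and the density formula are illustrative choices.
    """
    nx = int((x_range[1] - x_range[0]) / cell)
    ny = int((y_range[1] - y_range[0]) / cell)

    # Keep only points inside the chosen BEV window.
    m = ((points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1]) &
         (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1]))
    p = points[m]

    # Flat cell index for each point (row-major: x index is major).
    ix = ((p[:, 0] - x_range[0]) / cell).astype(int)
    iy = ((p[:, 1] - y_range[0]) / cell).astype(int)
    flat = ix * ny + iy

    # Unbuffered scatter-add: per-cell count, height sum, height sum of squares.
    count = np.zeros(nx * ny)
    h_sum = np.zeros(nx * ny)
    h_sq = np.zeros(nx * ny)
    np.add.at(count, flat, 1.0)
    np.add.at(h_sum, flat, p[:, 2])
    np.add.at(h_sq, flat, p[:, 2] ** 2)

    # Mean and standard deviation of heights in each non-empty cell.
    nz = count > 0
    mean = np.zeros(nx * ny)
    std = np.zeros(nx * ny)
    mean[nz] = h_sum[nz] / count[nz]
    std[nz] = np.sqrt(np.maximum(h_sq[nz] / count[nz] - mean[nz] ** 2, 0.0))

    # Distance-related density (assumed normalization): boost the raw count
    # by the squared range of the cell centre, so sparse far-away cells are
    # not drowned out, then compress with log1p.
    cx = (np.arange(nx) + 0.5) * cell + x_range[0]
    cy = (np.arange(ny) + 0.5) * cell + y_range[0]
    d2 = (cx[:, None] ** 2 + cy[None, :] ** 2).ravel()
    density = np.log1p(count * d2 / d2.max())

    return np.stack([mean, std, density]).reshape(3, nx, ny)
```

A map built this way is a dense 2D tensor, so it can be consumed directly by an ordinary image-style detection network, which is what makes the bird's-view encoding attractive for real-time pipelines.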




Author information

Correspondence to Nuo Cheng.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Cheng, N., Li, X., Lei, S., Li, P. (2020). BVNet: A 3D End-to-End Model Based on Point Cloud. In: Bebis, G., et al. Advances in Visual Computing. ISVC 2020. Lecture Notes in Computer Science, vol. 12509. Springer, Cham. https://doi.org/10.1007/978-3-030-64556-4_33

  • DOI: https://doi.org/10.1007/978-3-030-64556-4_33

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-64555-7

  • Online ISBN: 978-3-030-64556-4
