BVNet: A 3D End-to-End Model Based on Point Cloud

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12509)

Abstract

Point cloud data from LiDAR are increasingly used to detect road situations for autonomous driving. The most important issues here are detection accuracy and processing time. In this study, we propose a new model that improves detection performance on point clouds. A well-known difficulty in processing 3D point clouds is that the point data are unordered. To address this problem, we define 3D point cloud features in the grid cells of the bird’s view according to the distribution of the points. In particular, we introduce the average and standard deviation of the point heights, as well as a distance-related density of the points, as new features inside each cell. The resulting feature map is fed into a conventional neural network to obtain the outcomes, thus realizing an end-to-end real-time detection framework, called BVNet (Bird’s-View-Net). The proposed model is tested on the KITTI benchmark suite, and the results show considerable improvement in detection accuracy compared with models without the newly introduced features.
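The per-cell encoding described above can be sketched as follows. The grid extents, cell size, and the exact form of the distance-related density normalization are illustrative assumptions, not the formulas from the paper; the sketch only shows how an unordered point cloud can be turned into a fixed-size bird's-view feature map with mean height, height standard deviation, and a distance-weighted density per cell.

```python
import numpy as np

def bev_feature_map(points, x_range=(0.0, 40.0), y_range=(-20.0, 20.0),
                    cell=0.2):
    """Encode an unordered (N, 3) point cloud as a (3, nx, ny) BEV grid.

    Channels: mean height, std of heights, distance-weighted density.
    Ranges, cell size, and the density formula are illustrative choices.
    """
    nx = int((x_range[1] - x_range[0]) / cell)
    ny = int((y_range[1] - y_range[0]) / cell)

    # Keep only points inside the chosen BEV window.
    m = ((points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1]) &
         (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1]))
    p = points[m]

    # Flat cell index for each point (row-major: x index is major).
    ix = ((p[:, 0] - x_range[0]) / cell).astype(int)
    iy = ((p[:, 1] - y_range[0]) / cell).astype(int)
    flat = ix * ny + iy

    # Unbuffered scatter-add: per-cell count, height sum, height sum of squares.
    count = np.zeros(nx * ny)
    h_sum = np.zeros(nx * ny)
    h_sq = np.zeros(nx * ny)
    np.add.at(count, flat, 1.0)
    np.add.at(h_sum, flat, p[:, 2])
    np.add.at(h_sq, flat, p[:, 2] ** 2)

    # Mean and standard deviation of heights in each non-empty cell.
    nz = count > 0
    mean = np.zeros(nx * ny)
    std = np.zeros(nx * ny)
    mean[nz] = h_sum[nz] / count[nz]
    std[nz] = np.sqrt(np.maximum(h_sq[nz] / count[nz] - mean[nz] ** 2, 0.0))

    # Distance-related density (assumed normalization): boost the raw count
    # by the squared range of the cell centre, so sparse far-away cells are
    # not drowned out, then compress with log1p.
    cx = (np.arange(nx) + 0.5) * cell + x_range[0]
    cy = (np.arange(ny) + 0.5) * cell + y_range[0]
    d2 = (cx[:, None] ** 2 + cy[None, :] ** 2).ravel()
    density = np.log1p(count * d2 / d2.max())

    return np.stack([mean, std, density]).reshape(3, nx, ny)
```

A map built this way is a dense 2D tensor, so it can be consumed directly by an ordinary image-style detection network, which is what makes the bird's-view encoding attractive for real-time pipelines.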




Author information

Correspondence to Nuo Cheng.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Cheng, N., Li, X., Lei, S., Li, P. (2020). BVNet: A 3D End-to-End Model Based on Point Cloud. In: Bebis, G., et al. Advances in Visual Computing. ISVC 2020. Lecture Notes in Computer Science, vol. 12509. Springer, Cham. https://doi.org/10.1007/978-3-030-64556-4_33

  • DOI: https://doi.org/10.1007/978-3-030-64556-4_33

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-64555-7

  • Online ISBN: 978-3-030-64556-4
