Semantic frustum-based sparsely embedded convolutional detection

Feng, Yujian; Yu, Jian; Xu, Jing; Yuan, Rong

doi:10.1007/s11760-021-01854-0

Original Paper
Published: 19 January 2021

Volume 15, pages 1239–1246, (2021)
Signal, Image and Video Processing

Yujian Feng¹,
Jian Yu ORCID: orcid.org/0000-0002-0832-7775²,
Jing Xu³ &
…
Rong Yuan¹

Abstract

Frustum-based 3D detection methods are limited by their 2D detector: an object that the 2D image proposal misses will never be detected in the point cloud. In this work, we propose a novel method, semantic frustum-based sparsely embedded convolutional detection (SFB-SECOND), for 3D object detection, which addresses this heavy reliance of frustum-based methods on an accurate 2D detector. Specifically, for an image and a LiDAR scan describing the same scene, we first use established semantic segmentation and object detection methods to generate an object mask that selects all potential targets within two confidence-related regions. With this object mask, we quickly locate the objects of interest in the LiDAR point cloud and extract them as a semantic frustum. The selected frustum not only excludes more background and irrelevant objects from the LiDAR data but also makes fuller use of the rich 3D information. Then, to improve the accuracy of orientation estimation, we introduce a refined region-aware regression loss that works together with the region-aware frustum. In addition, a new data augmentation strategy is proposed to speed up convergence and further improve detection performance. The proposed SFB-SECOND achieves state-of-the-art performance on the KITTI 3D object detection benchmark at real-time speed, showing superiority over previous methods.
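
The mask-to-frustum step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `select_semantic_frustum`, the 3x4 projection matrix `P`, and the boolean `mask` layout are all assumptions made for illustration, and the points are assumed to be already expressed in the camera frame. The sketch projects each 3D point into the image and keeps it only if it lands on a pixel that the object mask marks as a potential target.

```python
import math

def select_semantic_frustum(points, P, mask):
    """Keep 3D points whose image projection falls on a True mask pixel.

    points: iterable of (x, y, z) tuples in the camera frame (assumption).
    P:      3x4 camera projection matrix as nested lists (hypothetical input).
    mask:   H x W nested list of booleans, True where an object may be.
    """
    height, width = len(mask), len(mask[0])
    kept = []
    for x, y, z in points:
        # Homogeneous projection: [u*d, v*d, d] = P @ [x, y, z, 1]
        ud, vd, d = (row[0] * x + row[1] * y + row[2] * z + row[3] for row in P)
        if d <= 0:
            continue  # point is behind the camera
        u, v = math.floor(ud / d), math.floor(vd / d)
        if 0 <= u < width and 0 <= v < height and mask[v][u]:
            kept.append((x, y, z))
    return kept

# Toy setup: unit focal length, principal point at pixel (2, 2), 4x4 image.
P = [[1.0, 0.0, 2.0, 0.0],
     [0.0, 1.0, 2.0, 0.0],
     [0.0, 0.0, 1.0, 0.0]]
mask = [[False] * 4 for _ in range(4)]
mask[2][2] = True  # a single "object" pixel at row 2, column 2

points = [
    (0.0, 0.0, 5.0),   # projects onto the masked pixel
    (1.0, 0.0, 1.0),   # projects onto an unmasked pixel
    (0.0, 0.0, -5.0),  # behind the camera
    (10.0, 0.0, 1.0),  # projects outside the image
]
frustum_points = select_semantic_frustum(points, P, mask)
```

In this toy setup only the first point survives. Because the selection follows the mask shape rather than a rectangular 2D box, background points that a box-shaped frustum would include are dropped, which is the benefit the abstract attributes to the semantic frustum.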

Author information

Authors and Affiliations

College of Automation and College of Artificial Intelligence, Nanjing University of Posts and Telecommunications, Nanjing, 210023, China
Yujian Feng & Rong Yuan
College of Computer Science and Technology/College of Artificial Intelligence, Nanjing University of Aeronautics and Astronautics, Nanjing, 21106, China
Jian Yu
School of Law, Hohai University, Nanjing, 21106, China
Jing Xu

Corresponding author

Correspondence to Jian Yu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Feng, Y., Yu, J., Xu, J. et al. Semantic frustum-based sparsely embedded convolutional detection. SIViP 15, 1239–1246 (2021). https://doi.org/10.1007/s11760-021-01854-0

Received: 01 August 2020
Revised: 16 November 2020
Accepted: 04 January 2021
Published: 19 January 2021
Issue Date: September 2021
DOI: https://doi.org/10.1007/s11760-021-01854-0
