research-article

An Multi-Sensors 3D Detection Network Using Guidance-Point-Based Feature Fusion

Authors:
Chao Wu

School of Mechanical Engineering, Beijing Institute of Technology, China

Bin Shao Wu

School of Mechanical Engineering, Beijing Institute of Technology, China


ICCAI '22: Proceedings of the 8th International Conference on Computing and Artificial Intelligence, March 2022, Pages 705–710. https://doi.org/10.1145/3532213.3532320

Published: 13 July 2022


ABSTRACT

An accurate and efficient 3D object detection system is crucial for autonomous vehicles. However, because of the complexity of the driving environment, a single sensor, such as a LiDAR or a camera, cannot meet the safety requirements of autonomous driving. This paper proposes a two-stage 3D detection network that uses Guidance-Point-Based feature fusion. In the first stage, image-space features are first converted to the bird's-eye view (BEV) by the Guidance-Point-Based feature mapping module designed in this paper. The LiDAR and camera features in BEV are then fused by an adaptive fusion module, and finally a Center-Based strategy is used for detection. In the second stage, keypoint features are used to further refine the objects output by the first stage. Evaluation on the nuScenes dataset shows that the proposed network achieves higher accuracy with little additional time.
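The abstract does not spell out how the mapping and fusion modules work, so the following is only a rough illustrative sketch of the general idea, not the authors' implementation. It assumes that guidance points are BEV cell centers placed at a fixed height, that each point is projected into the image with a pinhole camera model and picks up the nearest pixel's feature, and that the adaptive fusion is a per-cell sigmoid gate. All function names, parameters, and numeric values here (`project_points`, `image_to_bev`, `adaptive_fuse`, `w`, `b`, the toy calibration) are hypothetical.

```python
import numpy as np

def project_points(points_xyz, K, T_cam_from_ego):
    """Project N x 3 ego-frame points to pixels; returns (uv, valid mask)."""
    n = points_xyz.shape[0]
    homo = np.concatenate([points_xyz, np.ones((n, 1))], axis=1)  # N x 4
    cam = (T_cam_from_ego @ homo.T).T[:, :3]                      # camera frame
    depth = cam[:, 2]
    valid = depth > 1e-3                                          # in front of camera
    safe_depth = np.where(valid, depth, 1.0)                      # avoid divide-by-zero
    uv = (K @ cam.T).T[:, :2] / safe_depth[:, None]
    return uv, valid

def image_to_bev(img_feat, bev_xy, K, T_cam_from_ego, z=0.0):
    """Assumed mapping: sample a C x H x W image feature map at projected points.

    bev_xy holds the ego-frame (x, y) of each BEV cell center, shape Hb x Wb x 2.
    Cells whose point falls outside the image keep a zero feature.
    """
    C, H, W = img_feat.shape
    Hb, Wb, _ = bev_xy.shape
    pts = np.concatenate([bev_xy.reshape(-1, 2), np.full((Hb * Wb, 1), z)], axis=1)
    uv, valid = project_points(pts, K, T_cam_from_ego)
    u = np.rint(uv[:, 0]).astype(int)
    v = np.rint(uv[:, 1]).astype(int)
    valid &= (u >= 0) & (u < W) & (v >= 0) & (v < H)
    bev = np.zeros((C, Hb * Wb), dtype=img_feat.dtype)
    bev[:, valid] = img_feat[:, v[valid], u[valid]]               # nearest-pixel lookup
    return bev.reshape(C, Hb, Wb)

def adaptive_fuse(lidar_bev, cam_bev, w, b):
    """Assumed fusion: per-cell gate from a linear map of both feature maps.

    w (length 2C) and b (scalar) stand in for learned parameters.
    """
    stacked = np.concatenate([lidar_bev, cam_bev], axis=0)        # 2C x Hb x Wb
    logit = np.tensordot(w, stacked, axes=([0], [0])) + b         # Hb x Wb
    gate = 1.0 / (1.0 + np.exp(-logit))
    return gate * lidar_bev + (1.0 - gate) * cam_bev

# Toy setup; every value below is made up for illustration.
K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 30.0],
              [0.0, 0.0, 1.0]])
# Ego frame: x forward, y left, z up. Camera frame: x right, y down, z forward,
# with the camera mounted 1.5 m above the ego origin.
T_cam_from_ego = np.array([[0.0, -1.0, 0.0, 0.0],
                           [0.0, 0.0, -1.0, 1.5],
                           [1.0, 0.0, 0.0, 0.0],
                           [0.0, 0.0, 0.0, 1.0]])
xs = np.linspace(5.0, 20.0, 4)   # forward distance of each BEV row
ys = np.linspace(-3.0, 3.0, 3)   # lateral offset of each BEV column
bev_xy = np.stack(np.meshgrid(xs, ys, indexing="ij"), axis=-1)    # 4 x 3 x 2

rng = np.random.default_rng(0)
img_feat = rng.normal(size=(8, 60, 100))   # C x H x W camera features
lidar_bev = rng.normal(size=(8, 4, 3))     # C x Hb x Wb LiDAR BEV features
cam_bev = image_to_bev(img_feat, bev_xy, K, T_cam_from_ego)
fused = adaptive_fuse(lidar_bev, cam_bev, w=np.zeros(16), b=0.0)
```

With the gate weights set to zero, as in the toy call, every cell gets a gate of 0.5 and the fused map is the plain average of the two inputs. In a trained network the gate would instead vary per cell, for example leaning on LiDAR features where the camera projection is missing or unreliable. The detection head and the second-stage refinement are omitted here.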



Published in

ICCAI '22: Proceedings of the 8th International Conference on Computing and Artificial Intelligence
March 2022
809 pages
ISBN:9781450396110
DOI:10.1145/3532213

Copyright © 2022 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
3D Object Detection
Guidance-Point-Based Feature Mapping
Multi-Sensors
adaptive fusion
Qualifiers
- research-article
- Research
- Refereed limited
