Abstract
Robust and accurate object detection on roads populated with diverse objects is essential for automated driving. Radar has been employed in commercial advanced driver assistance systems (ADAS) for a decade thanks to its low cost and high reliability. However, radar has been used only in limited driving conditions, such as detecting a few forward vehicles on highways, because its low resolution and poor classification capability restrict performance. To fully exploit radar in complex road environments, we propose a learning-based detection network that uses a radar range-azimuth heatmap and a monocular image. We show that radar-image fusion can overcome the inherent weaknesses of radar by leveraging camera information. Our network has a two-stage architecture that fuses radar and image feature representations, rather than each sensor's prediction results, to improve detection performance over either sensor alone. To demonstrate the effectiveness of the proposed method, we collected radar, camera, and LiDAR data in driving environments that vary in vehicle speed, lighting conditions, and traffic volume. Experimental results show that the proposed fusion method outperforms both the radar-only and the image-only baselines.
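As a rough illustration of the feature-level (rather than decision-level) fusion the abstract describes, the following PyTorch sketch fuses features extracted from a radar range-azimuth heatmap and a monocular image before a shared detection head. The backbone sizes, the bilinear resampling of radar features onto the image feature grid, and the concatenation-based fusion are all illustrative assumptions, not the authors' exact architecture.

```python
# Minimal sketch of feature-level radar-image fusion: two per-sensor backbones,
# features fused before a shared detection head (not per-sensor detections).
# All layer sizes and the fusion mechanism are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvBackbone(nn.Module):
    """Tiny conv stack standing in for a real backbone (e.g. VGG or an FPN)."""
    def __init__(self, in_ch, out_ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, out_ch, 3, stride=2, padding=1), nn.ReLU(),
        )

    def forward(self, x):
        return self.net(x)

class FeatureLevelFusionDetector(nn.Module):
    def __init__(self, num_anchors=9, num_classes=2):
        super().__init__()
        self.radar_backbone = ConvBackbone(in_ch=1)  # 1-ch range-azimuth heatmap
        self.image_backbone = ConvBackbone(in_ch=3)  # 3-ch monocular RGB image
        self.fuse = nn.Conv2d(128, 128, 1)           # mix the concatenated features
        self.cls_head = nn.Conv2d(128, num_anchors * num_classes, 1)
        self.reg_head = nn.Conv2d(128, num_anchors * 4, 1)

    def forward(self, radar_heatmap, image):
        f_radar = self.radar_backbone(radar_heatmap)
        f_image = self.image_backbone(image)
        # Assumption: radar features are simply resampled onto the image feature
        # grid; a real system would use a calibrated geometric projection here.
        f_radar = F.interpolate(f_radar, size=f_image.shape[-2:],
                                mode="bilinear", align_corners=False)
        fused = torch.relu(self.fuse(torch.cat([f_radar, f_image], dim=1)))
        return self.cls_head(fused), self.reg_head(fused)

# Usage: fuse a 256x256 range-azimuth heatmap with a 384x512 RGB image.
model = FeatureLevelFusionDetector()
cls, reg = model(torch.randn(1, 1, 256, 256), torch.randn(1, 3, 384, 512))
print(cls.shape, reg.shape)
```

Fusing at the feature level lets the shared detection head exploit correlations between the two sensors that would be discarded if each sensor's detections were fused afterwards, which is the motivation the abstract gives for the two-stage design.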
J. Kim and Y. Kim contributed equally to this work.
This work was done when Jinhyeong Kim was at KAIST, prior to joining SOCAR.
Acknowledgement
This research was supported by the Technology Innovation Program (No. 10083646) funded by the Ministry of Trade, Industry & Energy, Korea, and by the KAIST-KU Joint Research Center, KAIST, Korea.
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Kim, J., Kim, Y., Kum, D. (2021). Low-Level Sensor Fusion for 3D Vehicle Detection Using Radar Range-Azimuth Heatmap and Monocular Image. In: Ishikawa, H., Liu, C.-L., Pajdla, T., Shi, J. (eds.) Computer Vision – ACCV 2020. Lecture Notes in Computer Science, vol. 12624. Springer, Cham. https://doi.org/10.1007/978-3-030-69535-4_24
Print ISBN: 978-3-030-69534-7
Online ISBN: 978-3-030-69535-4