
RVNet: Deep Sensor Fusion of Monocular Camera and Radar for Image-Based Obstacle Detection in Challenging Environments

  • Conference paper
Image and Video Technology (PSIVT 2019)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11854)

Abstract

Camera-based and radar-based obstacle detection are important research topics in environment perception for autonomous driving. Camera-based obstacle detection achieves state-of-the-art accuracy, but its performance degrades in challenging environments, where the camera features become noisy. In comparison, radar-based obstacle detection methods using 77 GHz long-range radar are unaffected by such environments; however, the radar features are sparse and provide no delineation of the obstacles. The camera and radar features are therefore complementary, and their fusion yields robust obstacle detection in varied environments. Once the sensors are calibrated, the radar features can be used to localize obstacles in the image, while the camera features can be used to delineate the localized obstacles. We propose a novel deep learning-based sensor fusion framework, termed the “RVNet”, for the effective fusion of a monocular camera and long-range radar for obstacle detection. The RVNet is a single-shot object detection network with two input branches and two output branches. The input branches contain separate branches for the monocular camera and the radar features, where the radar features are formulated using a novel feature descriptor termed the “sparse radar image”. The output branches contain separate branches for small obstacles and big obstacles, respectively. The proposed network is validated against state-of-the-art baseline algorithms on the public nuScenes dataset. Additionally, a detailed parameter analysis is performed with several variants of the RVNet. The experimental results show that the proposed network outperforms the baseline algorithms under varying environmental conditions.
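To make the two-input, two-output single-shot structure described in the abstract concrete, the sketch below shows one way such a network could be wired up in Keras. This is not the authors' implementation: the input resolution, the channels assumed for the sparse radar image, the anchor count, and all layer widths are illustrative assumptions chosen only to keep the example self-contained and runnable.

```python
# Minimal illustrative sketch (not the authors' implementation) of a
# single-shot detector with two input branches (camera image and a
# "sparse radar image") and two output branches (small and big obstacles).
import tensorflow as tf
from tensorflow.keras import layers, Model

NUM_ANCHORS = 3   # assumed number of anchors per grid cell
BOX_PARAMS = 5    # assumed (x, y, w, h, objectness) per anchor


def conv_block(x, filters):
    """Two 3x3 convolutions followed by 2x2 downsampling."""
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return layers.MaxPooling2D(2)(x)


# Input branch 1: monocular camera image (RGB).
image_in = layers.Input(shape=(416, 416, 3), name="camera_image")
x_img = conv_block(image_in, 32)
x_img = conv_block(x_img, 64)
x_img = conv_block(x_img, 128)

# Input branch 2: sparse radar image, i.e. radar detections projected onto
# the image plane (assumed here to carry depth and radial-velocity channels).
radar_in = layers.Input(shape=(416, 416, 2), name="sparse_radar_image")
x_rad = conv_block(radar_in, 16)
x_rad = conv_block(x_rad, 32)
x_rad = conv_block(x_rad, 64)

# Fuse the camera and radar feature maps (both 52x52 after downsampling).
fused = layers.Concatenate()([x_img, x_rad])
fused = layers.Conv2D(256, 3, padding="same", activation="relu")(fused)

# Output branch 1: fine-resolution grid intended for small obstacles.
small_head = layers.Conv2D(NUM_ANCHORS * BOX_PARAMS, 1,
                           name="small_obstacles")(fused)

# Output branch 2: coarser grid, after further downsampling, for big obstacles.
big = conv_block(fused, 256)
big_head = layers.Conv2D(NUM_ANCHORS * BOX_PARAMS, 1,
                         name="big_obstacles")(big)

model = Model(inputs=[image_in, radar_in], outputs=[small_head, big_head])
model.summary()
```

The two heads simply predict anchor-box parameters on grids of different resolution; in practice the loss functions, anchor design, and radar-to-image projection would follow the details given in the paper itself.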


Author information

Corresponding author

Correspondence to Vijay John.



Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

John, V., Mita, S. (2019). RVNet: Deep Sensor Fusion of Monocular Camera and Radar for Image-Based Obstacle Detection in Challenging Environments. In: Lee, C., Su, Z., Sugimoto, A. (eds) Image and Video Technology. PSIVT 2019. Lecture Notes in Computer Science, vol. 11854. Springer, Cham. https://doi.org/10.1007/978-3-030-34879-3_27

  • DOI: https://doi.org/10.1007/978-3-030-34879-3_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-34878-6

  • Online ISBN: 978-3-030-34879-3

  • eBook Packages: Computer Science, Computer Science (R0)
