Road segmentation with image-LiDAR data fusion in deep neural network

Liu, Huafeng; Yao, Yazhou; Sun, Zeren; Li, Xiangrui; Jia, Ke; Tang, Zhenming

doi:10.1007/s11042-019-07870-0

Published: 27 July 2019

Multimedia Tools and Applications, volume 79, pages 35503–35518 (2020)

Huafeng Liu¹ (ORCID: orcid.org/0000-0001-5396-3183), Yazhou Yao¹, Zeren Sun¹, Xiangrui Li¹, Ke Jia² & Zhenming Tang¹

Abstract

Robust road segmentation is a key challenge in self-driving research. Although many image-based methods have been studied and report high performance on benchmark datasets, developing robust and reliable road segmentation remains a major challenge. Fusing data across different sensors is widely considered an important and irreplaceable way to improve road segmentation performance. In this paper, we propose a novel structure that fuses image and LiDAR point cloud data in an end-to-end semantic segmentation network, in which fusion is performed at the decoder stage rather than, as is more common, at the encoder stage. During fusion, we introduce a pyramid projection method that improves the precision of the generated multi-scale LiDAR maps. Additionally, we adapt the multi-path refinement network to our fusion strategy, which improves road prediction compared with transposed convolution with skip layers. Our approach has been tested on the KITTI ROAD dataset and achieves competitive performance.
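
The abstract does not spell out how the pyramid projection works, so the following is only an illustrative sketch under one plausible reading: each level of the multi-scale LiDAR map is produced by projecting the 3D points directly into a grid of that level's resolution, instead of downsampling a single full-resolution map. The pinhole parameters (`fx`, `fy`, `cx`, `cy`), the function names, and the nearest-depth rule for cells hit by several points are all assumptions made for this example, not details taken from the paper.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in camera coordinates, z pointing forward
DepthMap = List[List[Optional[float]]]  # None marks a cell with no LiDAR return


def project_at_scale(points: Sequence[Point], fx: float, fy: float,
                     cx: float, cy: float, width: int, height: int,
                     scale: int) -> DepthMap:
    """Project points directly into a (height // scale) x (width // scale) grid."""
    w, h = width // scale, height // scale
    grid: DepthMap = [[None] * w for _ in range(h)]
    for x, y, z in points:
        if z <= 0:  # point lies behind the camera, skip it
            continue
        # Pinhole projection to full-resolution pixel coordinates,
        # then mapping to the coarser grid of this pyramid level.
        u = math.floor((fx * x / z + cx) / scale)
        v = math.floor((fy * y / z + cy) / scale)
        if 0 <= u < w and 0 <= v < h:
            # Assumed rule: keep the nearest return when points share a cell.
            if grid[v][u] is None or z < grid[v][u]:
                grid[v][u] = z
    return grid


def pyramid_projection(points: Sequence[Point], fx: float, fy: float,
                       cx: float, cy: float, width: int, height: int,
                       scales: Sequence[int] = (1, 2, 4, 8)) -> Dict[int, DepthMap]:
    """Build one sparse depth map per scale, each projected from the raw points."""
    return {s: project_at_scale(points, fx, fy, cx, cy, width, height, s)
            for s in scales}
```

In a decoder-stage fusion design, each of these maps could be paired with the decoder feature map of matching resolution; how the paper actually combines them is not stated in the abstract.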

Author information

Authors and Affiliations

School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing City, 210094, China
Huafeng Liu, Yazhou Yao, Zeren Sun, Xiangrui Li & Zhenming Tang
School of Computer Science, Chengdu University of Information Technology, Chengdu City, China
Ke Jia

Corresponding author

Correspondence to Huafeng Liu.

About this article

Cite this article

Liu, H., Yao, Y., Sun, Z. et al. Road segmentation with image-LiDAR data fusion in deep neural network. Multimed Tools Appl 79, 35503–35518 (2020). https://doi.org/10.1007/s11042-019-07870-0

Received: 26 January 2019
Revised: 16 April 2019
Accepted: 05 June 2019
Published: 27 July 2019
Issue Date: December 2020
DOI: https://doi.org/10.1007/s11042-019-07870-0
