Road segmentation with image-LiDAR data fusion in deep neural network

Multimedia Tools and Applications

Abstract

Robust road segmentation is a key challenge in self-driving research. Although many image-based methods have been studied and report high performance on benchmark datasets, developing robust and reliable road segmentation remains a major challenge. Fusing data from different sensors is widely considered an important, even irreplaceable, way to improve road segmentation performance. In this paper, we propose a novel structure that fuses images and LiDAR point clouds in an end-to-end semantic segmentation network, in which the fusion is performed at the decoder stage rather than, as is more common, at the encoder stage. During fusion, we introduce a pyramid projection method to increase the precision of the multi-scale LiDAR maps. Additionally, we adapt the multi-path refinement network to our fusion strategy and improve road prediction compared with transposed convolution with skip layers. Our approach has been tested on the KITTI ROAD dataset and achieves competitive performance.
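
The abstract does not spell out the pyramid projection method, so the following is only an illustrative sketch of one plausible reading: instead of downsampling a single full-resolution sparse LiDAR map, each point is projected directly onto a grid at every scale. The pinhole camera model, the nearest-depth rule for colliding points, and all names (`project_to_scale`, `lidar_pyramid`, `fx`, `fy`, `cx`, `cy`) are assumptions made for illustration and are not taken from the paper.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in camera coordinates, z pointing forward


def project_to_scale(points: Sequence[Point], fx: float, fy: float,
                     cx: float, cy: float, width: int, height: int,
                     scale: int) -> List[List[float]]:
    """Project points straight onto a grid downsampled by `scale`.

    Assumption: a simple pinhole model; cells with no LiDAR return hold 0.0.
    """
    w, h = width // scale, height // scale
    depth = [[0.0] * w for _ in range(h)]
    for x, y, z in points:
        if z <= 0:  # point is behind the camera, so it cannot be projected
            continue
        # Pixel coordinates at full resolution, then mapped to this scale.
        col = math.floor((fx * x / z + cx) / scale)
        row = math.floor((fy * y / z + cy) / scale)
        if 0 <= col < w and 0 <= row < h:
            # Assumption: when several points share a cell, keep the nearest.
            if depth[row][col] == 0.0 or z < depth[row][col]:
                depth[row][col] = z
    return depth


def lidar_pyramid(points: Sequence[Point], fx: float, fy: float,
                  cx: float, cy: float, width: int, height: int,
                  scales: Sequence[int] = (1, 2, 4, 8)) -> Dict[int, List[List[float]]]:
    """Build one sparse depth map per scale, each projected independently."""
    return {s: project_to_scale(points, fx, fy, cx, cy, width, height, s)
            for s in scales}


if __name__ == "__main__":
    pts = [(0.0, 0.0, 10.0), (1.0, 0.5, 5.0), (0.0, 0.0, -1.0)]
    pyramid = lidar_pyramid(pts, 100.0, 100.0, 32.0, 16.0, 64, 32, scales=(1, 4, 8))
    for s, grid in pyramid.items():
        print(s, len(grid), len(grid[0]))
```

Under this reading, each scale's map could be paired with the decoder feature map of matching resolution, which is where the abstract says fusion takes place. Projecting per scale avoids averaging empty cells into valid depths, as would happen if a sparse full-resolution map were pooled; whether the paper does exactly this is not confirmed by the abstract.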

Author information

Corresponding author

Correspondence to Huafeng Liu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Liu, H., Yao, Y., Sun, Z. et al. Road segmentation with image-LiDAR data fusion in deep neural network. Multimed Tools Appl 79, 35503–35518 (2020). https://doi.org/10.1007/s11042-019-07870-0

  • DOI: https://doi.org/10.1007/s11042-019-07870-0
