ABSTRACT
In this paper we fuse airborne Lidar data and high-resolution images (0.5 feet per pixel) to identify road networks over a large geographic region. We perform pixel-wise segmentation, classifying each pixel as road or non-road based on color and depth features drawn from a larger neighborhood context. The setting is bimodal: the RGB pixels represent the color space, and the depth values come from three-dimensional Lidar readings. We present multiple strategies for fusing Lidar and images. We describe TriSeg, a cost-effective, modular, deep convolutional network design that achieves a better IoU on the aerial road segmentation problem than state-of-the-art RGB-only architectures. We also report results for many other architectures and release our dataset for further research: https://bitbucket.org/biswas/fusion_lidar_images.
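To make the bimodal setup and the IoU metric concrete, the following is a minimal sketch in plain Python. It is not the TriSeg design, and it is not necessarily one of the fusion strategies the paper evaluates. It assumes the simplest possible fusion, appending the Lidar depth value to each RGB pixel as a fourth channel, and it computes intersection over union for binary road masks. The function names `stack_rgb_depth` and `road_iou`, the nested-list image layout, and the convention of returning 1.0 when both masks are empty are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

RGB = Tuple[int, int, int]
RGBD = Tuple[float, float, float, float]


def stack_rgb_depth(
    rgb: Sequence[Sequence[RGB]], depth: Sequence[Sequence[float]]
) -> List[List[RGBD]]:
    """Hypothetical early fusion: append each depth value to its RGB pixel."""
    if len(rgb) != len(depth) or any(
        len(rgb_row) != len(depth_row) for rgb_row, depth_row in zip(rgb, depth)
    ):
        raise ValueError("RGB image and depth map must have the same shape")
    return [
        [(*pixel, d) for pixel, d in zip(rgb_row, depth_row)]
        for rgb_row, depth_row in zip(rgb, depth)
    ]


def road_iou(pred: Sequence[Sequence[int]], truth: Sequence[Sequence[int]]) -> float:
    """Intersection over union of the road class for two binary masks.

    A nonzero entry means "road". Returns 1.0 when neither mask contains
    any road pixel (an assumed convention, since 0/0 is undefined).
    """
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            is_pred, is_truth = bool(p), bool(t)
            intersection += is_pred and is_truth
            union += is_pred or is_truth
    return intersection / union if union else 1.0


# Usage: a 2x3 predicted mask against a 2x3 ground-truth mask.
predicted = [[1, 1, 0], [0, 1, 0]]
ground_truth = [[1, 0, 0], [0, 1, 1]]
print(road_iou(predicted, ground_truth))  # 2 shared road pixels / 4 in the union = 0.5
```

Channel stacking feeds a single network a four-channel input; other designs could process each modality in a separate branch and merge features later, which is why the sketch labels its choice as an assumption.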
Index Terms
- Fusion of aerial lidar and images for road segmentation with deep CNN