LiDAR-based vehicle localization on the satellite image via a neural network

https://doi.org/10.1016/j.robot.2020.103519

Abstract

We present a novel LiDAR-based method to localize the vehicle on an easily accessible, geo-referenced satellite image. We first design a neural network that extracts and compares the spatial-discriminative feature maps of a satellite image patch and the LiDAR points, and outputs their probability of correspondence. Based on the outputs of the network, a particle filter then estimates the probability distribution of the vehicle pose. The method can take LiDAR points and any type of odometry as input to localize the vehicle. Experimental results show that our model generalizes well across several datasets. Compared with other methods, ours is more robust in challenging scenarios such as occluded or shadowed areas on the satellite image.

Introduction

Vehicle localization is a key issue in autonomous driving, and one of the predominant ways of estimating the vehicle position is the Global Navigation Satellite System (GNSS). However, GNSS may suffer from degraded positioning accuracy, or even outright failure, when its signals are blocked or reflected by surrounding buildings (the so-called urban canyon effect).

In order to improve the reliability and accuracy of the localization system, a prior map can be used as a complement to GNSS. One way to acquire the prior map is to use a Mobile Mapping System [1], in which a vehicle equipped with Light Detection and Ranging (LiDAR) or other sensors travels along the road to construct the map [2]. The map-building process can be formulated as a Simultaneous Localization and Mapping (SLAM) problem, in which a vehicle constructs the map of an unknown environment while simultaneously locating itself within that map [3]. However, because building such a map requires a huge amount of work, implementing this process in a large-scale environment is very costly.

In contrast, satellite images are easy to access and cover almost the entire world. Therefore, many approaches seek to use the satellite image as the prior map. These works register ground-level images or LiDAR points acquired by the vehicle to the satellite map in order to obtain a geo-referenced position from the map. However, the registration process is challenging because of differences in sensor modality, time of acquisition, viewpoint, illumination, frequent object occlusions, motion, etc. [4]. Compared with cameras, LiDAR is less sensitive to lighting, so some works use LiDAR to localize the vehicle on the satellite image. These traditional methods do not generalize particularly well: because illumination varies with the region and acquisition time of the satellite image, it is hard to find image-preprocessing parameters that perform well in all scenarios. Besides, trees, buildings and bridges cause occlusions or shadows on the satellite image, introducing noise that may lead to mismatches between the satellite image and the LiDAR grid map.

In this paper, we propose a learning-based method to address the challenges mentioned above. It uses a single frame of LiDAR points to estimate the position and heading of the vehicle on a satellite image (see Fig. 1). We design a neural network that learns to compare the spatial-discriminative feature maps of a satellite image patch and the LiDAR points, and outputs their probability of correspondence. This output then serves as the observation of a particle filter that estimates a distribution over the vehicle's pose. We also present a way to train this neural model for better performance. We evaluate our method on several datasets, and the results show that the learned neural model generalizes well to different environments. Compared with other methods, ours achieves stable localization even in the challenging scenarios mentioned above (occluded or shadowed areas on the satellite image).
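To make the coupling between the learned matching score and the pose estimator concrete, the following is a minimal sketch of one particle-filter cycle in this spirit. The `score_fn` callable stands in for the trained network evaluated at a candidate pose; the noise scales, the resampling threshold and the interface itself are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def particle_filter_step(particles, weights, odom_delta, score_fn,
                         motion_noise=(0.5, 0.5, 0.02), rng=None):
    """One predict/update/resample cycle of a pose particle filter (sketch).

    particles : (N, 3) array of [x, y, heading] hypotheses on the satellite map.
    odom_delta: [dx, dy, dheading] from any odometry source (IMU, wheel, LiDAR).
    score_fn  : callable(pose) -> probability that the current LiDAR scan matches
                the satellite patch rendered at `pose` (stands in for the network).
    """
    rng = rng or np.random.default_rng()

    # Predict: propagate every particle with the odometry increment plus noise.
    particles = particles + np.asarray(odom_delta) + rng.normal(0.0, motion_noise, particles.shape)

    # Update: use the network's correspondence probability as the observation weight.
    weights = weights * np.array([score_fn(p) for p in particles])
    weights = weights / (weights.sum() + 1e-12)

    # Resample when the effective sample size collapses.
    n = len(weights)
    if 1.0 / np.sum(weights ** 2) < 0.5 * n:
        idx = rng.choice(n, size=n, p=weights)
        particles, weights = particles[idx], np.full(n, 1.0 / n)
    return particles, weights
```

The pose estimate can then be taken, for example, as the weighted mean of the particles, and the cycle repeats with each new LiDAR scan and odometry increment.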

Section snippets

Related work

There are many works that seek to achieve localization by matching satellite images with sensor data from the vehicle. Many of them are based on a particle filter framework [5] that estimates the vehicle's pose; the main difference among these works is the method used to compute the particle weights.

Many approaches compare the image collected by a camera mounted on the vehicle with satellite image patches at the vehicle's location. For example, camera images

Proposed method

As shown in Fig. 2, our method takes a single LiDAR scan and odometry data (e.g. from an IMU, a wheel odometer or LiDAR odometry) as input. A geo-referenced satellite image is the only prior information used by the method. The output is an estimate of the vehicle's position and orientation relative to this map. Our method consists of two main parts: a neural network that judges how well a satellite image patch matches the LiDAR points, and a particle filter that estimates the distribution of the vehicle's pose from the network's outputs.
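As an illustration of the two-branch matching idea (one branch encoding the satellite patch, one encoding the rasterized LiDAR points, fused into a correspondence probability), here is a minimal PyTorch sketch. The layer sizes, channel counts and fusion head are placeholder choices, not the authors' architecture.

```python
import torch
import torch.nn as nn

class MatchNet(nn.Module):
    """Two-branch patch/scan matching network (illustrative sketch)."""

    def __init__(self):
        super().__init__()
        def encoder(in_ch):
            # Small convolutional encoder producing a spatial feature map.
            return nn.Sequential(
                nn.Conv2d(in_ch, 16, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            )
        self.sat_branch = encoder(3)    # RGB satellite image patch
        self.lidar_branch = encoder(1)  # LiDAR scan rasterized into a grid map
        self.head = nn.Sequential(      # compare the concatenated feature maps
            nn.Conv2d(128, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, 1),
        )

    def forward(self, sat_patch, lidar_grid):
        feats = torch.cat([self.sat_branch(sat_patch),
                           self.lidar_branch(lidar_grid)], dim=1)
        return torch.sigmoid(self.head(feats)).squeeze(1)  # correspondence probability

# Usage: score a batch of 128x128 patch/grid pairs.
net = MatchNet()
prob = net(torch.rand(4, 3, 128, 128), torch.rand(4, 1, 128, 128))
```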

Network training

We obtain the data from the KITTI dataset [25], collected by a vehicle driving through different scenes in Karlsruhe, Germany (the data are drawn from the synced raw recordings rather than the KITTI odometry benchmark). The vehicle is equipped with an OXTS RT 3003 integrated navigation system, a Velodyne HDL-64E LiDAR and several other sensors. We sample 22 sequences as the training set and 10 sequences as the validation set. Satellite images at different times are drawn from Google Maps, which means there may be differences between the satellite images and the actual scenes at the time of data collection.
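To illustrate how such patch/scan training pairs can be assembled, here is a small sketch that crops a satellite patch at (or near) the pose of a LiDAR grid map and applies the same mirroring/rotation augmentation to both inputs. The offsets, patch size and 90-degree rotations are illustrative choices, not the paper's exact sampling or augmentation scheme.

```python
import numpy as np

def make_training_pair(sat_image, lidar_grid, pose_px, patch_size=128,
                       negative=False, max_offset_px=30, rng=None):
    """Build one (satellite patch, LiDAR grid, label) sample (sketch).

    A positive pair crops the satellite image at the ground-truth pixel pose of
    the scan; a negative pair crops at a perturbed pose so the network learns
    to reject misaligned patches.
    """
    rng = rng or np.random.default_rng()
    x, y = pose_px
    if negative:
        x += rng.integers(-max_offset_px, max_offset_px + 1)
        y += rng.integers(-max_offset_px, max_offset_px + 1)
    h = patch_size // 2
    patch = sat_image[y - h:y + h, x - h:x + h]

    # Augmentation: apply the same mirroring/rotation to both inputs so that
    # positive pairs stay aligned after the transform.
    k = int(rng.integers(0, 4))                 # rotate by k * 90 degrees
    patch, grid = np.rot90(patch, k), np.rot90(lidar_grid, k)
    if rng.random() < 0.5:                      # horizontal mirroring
        patch, grid = np.fliplr(patch), np.fliplr(grid)
    return patch.copy(), grid.copy(), 0.0 if negative else 1.0
```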

Results and discussion

Fig. 6 shows the ROC curves of our neural model and the baseline methods. Our model outperforms the baselines, which indicates that it generalizes better. Comparing the data-augmentation settings, training without mirroring or rotating performs worst, and the improvement from mirroring is smaller than that from rotating.
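For reference, an ROC comparison of this kind can be produced directly from the predicted correspondence probabilities and the ground-truth match labels; the sketch below assumes such arrays are available for each model variant and uses scikit-learn and matplotlib.

```python
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, roc_auc_score

def plot_roc_comparison(models):
    """Plot ROC curves for several model variants.

    `models` maps a display name to (y_true, y_score): binary match labels and
    the correspondence probabilities predicted for held-out patch/scan pairs.
    """
    for name, (y_true, y_score) in models.items():
        fpr, tpr, _ = roc_curve(y_true, y_score)
        plt.plot(fpr, tpr, label=f"{name} (AUC = {roc_auc_score(y_true, y_score):.3f})")
    plt.plot([0, 1], [0, 1], "k--", linewidth=0.8)  # chance level
    plt.xlabel("False positive rate")
    plt.ylabel("True positive rate")
    plt.legend()
    plt.show()
```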

Figs. 7 and 8 show the localization trajectories on these sequences

Conclusion

We propose a learning-based method to localize the vehicle on a geo-referenced satellite image. Our method takes a single LiDAR scan as input and outputs the position and heading of the vehicle. In this method, a neural network is designed to learn to match the LiDAR points with their corresponding satellite image patch. The network outputs a probability of correspondence, which serves as the observation in a particle filter that estimates a distribution over the vehicle's pose. We also present a way to train this neural model for better performance.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (NSFC, 61973034, U1913203, 61473042 and 61903034).

References (32)

  • I. Puente et al., Review of mobile mapping and surveying technologies, Measurement (2013).
  • S. Thrun et al., Robust Monte Carlo localization for mobile robots, Artificial Intelligence (2001).
  • K. Novak, Mobile mapping systems: New tools for the fast collection of GIS information.
  • H. Durrant-Whyte et al., Simultaneous localization and mapping: Part I, IEEE Robot. Autom. Mag. (2006).
  • S. Lefèvre et al., Toward seamless multiview scene analysis from satellite to street level, Proc. IEEE (2017).
  • M. Noda et al., Vehicle ego-localization by matching in-vehicle camera images to an aerial image.
  • A. Viswanathan et al., Vision based robot localization by ground to satellite matching in GPS-denied situations.
  • N.N. Vo et al., Localizing and orienting street views using overhead imagery.
  • D.-K. Kim et al., Satellite image-based localization via learned embeddings.
  • S. Hu, M. Feng, R.M. Nguyen, G. Hee Lee, CVM-Net: Cross-view matching network for image-based ground-to-aerial...
  • X. Wang et al., FLAG: Feature-based localization between air and ground.
  • T. Senlet et al., A framework for global vehicle localization using stereo images and satellite and road maps.
  • T. Senlet et al., Satellite image based precise robot localization on sidewalks.
  • A. Viswanathan et al., Vision-based robot localization across seasons and in remote locations.
  • A. Gawel et al., X-View: Graph-based semantic multi-view localization, IEEE Robot. Autom. Lett. (2018).
  • G. Máttyus, S. Wang, S. Fidler, R. Urtasun, HD Maps: Fine-grained road segmentation by parsing ground and aerial...

Mengyin Fu received the B.S. degree from Liaoning University, China, the M.S. degree from the Beijing Institute of Technology, China, and the Ph.D. degree from the Chinese Academy of Sciences. He was elected as a Yangtze River Scholar Distinguished Professor in 2009. He received the Guanghua Engineering Science and Technology Award (Youth Award) in 2010 and the National Science and Technology Progress Award several times in recent years. Prof. Fu is the President of the Nanjing University of Science and Technology. His research interests cover integrated navigation, intelligent navigation, image processing, learning and recognition, and their applications.

Minzhao Zhu received the B.S. degree in automation from the Beijing Institute of Technology, China, in 2017. He is currently pursuing the M.S. degree in control science and engineering in the School of Automation, Beijing Institute of Technology. His research interests include autonomous vehicles, computer vision, SLAM, and air-ground collaborative perception.

    Yi Yang received the Ph.D. degree in control science and engineering from the School of Automation, Beijing Institute of Technology, Beijing, China, in 2010. He is currently a Professor with the School of Automation, Beijing Institute of Technology. His research interests include the area of mobile robots and unmanned ground vehicles, with focus on sensor fusion, localization and mapping, machine learning, collaborative perception, motion planning and control for autonomous navigation.

Wenjie Song received the B.S. and Ph.D. degrees from the Beijing Institute of Technology, Beijing, China, in 2013 and 2019, respectively. He was a visiting scholar at Princeton University from 2016 to 2017. He is currently an Assistant Professor with the School of Automation, Beijing Institute of Technology. His research interests include autonomous driving, environmental perception, SLAM and path planning.

Meiling Wang received the B.S. degree in automation from the Beijing Institute of Technology, China, in 1992, and the M.S. and Ph.D. degrees from the Beijing Institute of Technology, China, in 1995 and 2007, respectively. She has been teaching at the Beijing Institute of Technology since 1995 and worked at the University of California, San Diego as a Visiting Scholar in 2004. She was elected as a Yangtze River Scholar Distinguished Professor in 2014. She is currently the Director of the Integrated Navigation and Intelligent Navigation Laboratory, Beijing Institute of Technology, China. Her research interests include advanced sensing and detection technology and intelligent vehicle navigation.
