Elsevier

Computer Science Review

Volume 46, November 2022, 100510

Review Article
Visual SLAM for underwater vehicles: A survey

https://doi.org/10.1016/j.cosrev.2022.100510

Highlights

  • A comprehensive review of underwater visual SLAM.

  • Discusses the challenges of underwater visual SLAM and shares its prospects.

Abstract

The underwater scene is highly unstructured and full of various noise interferences. Moreover, GPS information is not available in the underwater environment, which brings huge challenges to the navigation of autonomous underwater vehicles. As an autonomous navigation technology, Simultaneous Localization and Mapping (SLAM) can deliver reliable localization to vehicles in unknown environments and generate models of their surroundings. With the development and utilization of marine and other underwater resources, underwater SLAM has become a hot research topic. Focusing on underwater visual SLAM, this paper reviews the basic theories and research progress of its modules, such as sensors, visual odometry, state optimization and loop closure detection, discusses the challenges faced by underwater visual SLAM, and shares its prospects. It is found that underwater visual SLAM is gradually shifting from traditional filtering-based methods towards optimization-based methods; the field shows a diversified trend, and various new methods have emerged. This paper aims to provide researchers and practitioners with a better understanding of the current status and development trend of underwater visual SLAM, while helping advance the intelligence of underwater vehicles.

Introduction

With the development of robot technology, the Autonomous Underwater Vehicle (AUV) has become one of the important means of marine resources exploration and exploitation. Accurate positioning and navigation play a critical role in ensuring that underwater vehicles can move stably and complete their tasks successfully. Due to the rapid attenuation of radio signals, including GPS signals, in water, it is difficult to use GPS for autonomous navigation of underwater vehicles. Long Baseline (LBL), Short Baseline (SBL), Ultra Short Baseline (USBL) [1] and similar methods rely on nearby ships and other carriers to transmit signals to underwater vehicles, making them unsuitable for long-distance operation. In contrast, Simultaneous Localization and Mapping (SLAM) [2] uses sensors to collect different kinds of data for analysis and processing; based on the results, SLAM can estimate the positions of vehicles, making autonomous localization and navigation of underwater vehicles possible [3]. As an automatic navigation method, SLAM has made great progress over recent years, with wide application in robotics [4] and autonomous driving [5]. Unlike methods that rely on external information, SLAM needs only the vehicle's own sensors to obtain real-time information about the surrounding environment, and can create maps and localize vehicles without any prior input. Thus, vehicles can achieve truly autonomous navigation and positioning in unfamiliar environments [3]. Underwater SLAM plays an increasingly important role in the navigation of underwater vehicles.

According to the types of sensors, underwater SLAM can be divided into Light Detection and Ranging (LiDAR) SLAM, sonar SLAM, and visual SLAM. LiDAR and sonar equipment are expensive, so they are not well suited to civilian robots. LiDAR uses laser light to analyze the contour and structure of targets. However, because of small particles in water, the laser is absorbed and scattered, which affects the measurement results; the working range of LiDAR underwater is therefore limited, and maps constructed by LiDAR lack semantic information. Sonar uses a transmitter to emit sound waves and a receiver to collect the echoes, then analyzes and processes the echo signals to describe the contour and structure of the target. Since sound propagation under water is not affected by light, sonar is a good choice for underwater SLAM [6]. However, acoustic waves are significantly affected by water flow, seismic activity, ship traffic, marine life and other factors. In addition, in some special cases, such as underwater caves and other enclosed spaces, sound waves rebound many times, eventually causing interference. All of these factors bring big challenges to underwater positioning and mapping. In contrast, vision-based SLAM has become a hot research field over recent years due to its low cost and high portability, although it is affected both by particles and by light conditions under water; various underwater image enhancement algorithms [7] can relieve these difficulties to some extent. A comparison of the three SLAM methods is presented in Table 1.

To position and map in an unknown environment, sensors must be used to obtain the key features of the environment, and the current state of the vehicle is estimated from the information obtained together with its previous states. As the vehicle keeps moving, estimation errors inevitably appear. To correct these errors and ensure long-term stable operation of the vehicle, loop closure detection is needed.

The process of visual SLAM can be divided into five parts: sensor data, front-end, back-end, loop closure detection, and mapping, as shown in Fig. 1. In visual SLAM, the sensors mainly include cameras, as well as some internal sensors of the vehicle, such as an Inertial Measurement Unit (IMU), a depth sensor and so on. The front-end, often called Visual Odometry (VO), mainly provides optimized sensor data for the back-end system. The back-end optimizes and updates the state of the vehicle based on the data from the front-end and from loop closure detection, and then calculates the trajectory of the vehicle and the map of its surrounding environment. Loop closure detection decides whether the vehicle has reached a previously visited position and addresses the problem of drift [8] over time.
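The data flow among these five parts can be sketched schematically. All functions below are illustrative stubs, not any particular SLAM implementation; the motion model and the loop-closure trigger are invented for demonstration:

```python
import numpy as np

def front_end(frame_id):
    """Visual odometry stub: relative motion between successive frames."""
    return np.array([1.0, 0.0])            # pretend: 1 m forward per frame

def detect_loop(frame_id):
    """Loop closure stub: flag a revisit of frame 0 at frame 4."""
    return 0 if frame_id == 4 else None

def back_end(trajectory, i, j):
    """Back-end stub: snap pose j onto the matched pose i (a real
    back-end redistributes the error over the whole graph instead)."""
    trajectory[j] = trajectory[i].copy()
    return trajectory

trajectory = [np.zeros(2)]                 # poses double as the map here
for frame_id in range(1, 5):               # stand-in for the camera stream
    trajectory.append(trajectory[-1] + front_end(frame_id))   # front-end
    match = detect_loop(frame_id)                             # loop closure
    if match is not None:
        trajectory = back_end(trajectory, match, frame_id)    # back-end
print(trajectory[-1])                      # drift removed at the revisit
```

In a real system the front-end would match image features, and the back-end would run a graph optimization rather than overwrite a single pose; the stub only shows where each module sits in the loop.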

For static, rigid scenes with stable illumination and little interference, SLAM technology is quite mature [9]. However, unlike controllable ground or indoor environments, the underwater environment is highly unstructured, with various kinds of noise interference, which brings multifarious difficulties and challenges to underwater visual SLAM. For instance, due to scattering and absorption, light attenuates in water, so image contrast becomes low. Moreover, different wavelengths attenuate at different rates in water: higher-frequency blue-green light penetrates the water and its particles more easily, resulting in blue-green underwater images. Dissolved organic matter and suspended particles in the water introduce heavy noise. Underwater scenes often show a single structure and lack rich features, which makes feature detection and matching difficult. Therefore, underwater SLAM is often much harder to implement than SLAM on the ground. As shown in Fig. 2, unstructured underwater scenes often include underwater buses, caves, ships, rocks, seaweeds, corals, etc.
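The blue-green color cast described above can be partially countered by simple channel rebalancing. The sketch below uses gray-world white balance, one of the simplest ideas and far cruder than the enhancement algorithms surveyed in the literature, on a synthetic image with an attenuated red channel:

```python
import numpy as np

def gray_world(img):
    """Rescale each color channel so its mean matches the global mean.
    img: float array of shape (H, W, 3) with values in [0, 1]."""
    means = img.reshape(-1, 3).mean(axis=0)      # per-channel means
    gain = means.mean() / means                  # boost dim channels
    return np.clip(img * gain, 0.0, 1.0)

# Synthetic "underwater" image: red channel heavily attenuated.
rng = np.random.default_rng(0)
img = rng.uniform(0.0, 1.0, (8, 8, 3))
img[..., 0] *= 0.3                               # simulate red absorption

out = gray_world(img)
print(out.reshape(-1, 3).mean(axis=0))           # channel means now equal
```

Real underwater enhancers also model scattering and depth-dependent attenuation; the gray-world assumption (an average scene is gray) is only a first-order correction of the color cast.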

Underwater SLAM has attracted wide attention from researchers. For example, [3], [10] summarized the state optimization algorithms commonly used in underwater SLAM, and [11] reviewed underwater acoustic SLAM from the perspective of sonar image registration and loop closure detection. Over recent years, underwater visual SLAM has developed rapidly and played an important role in marine resources exploration, yet a systematic review of it is still lacking. Therefore, after consulting the literature on underwater visual SLAM from the past 15 years on Web of Science, IEEE Xplore and Google Scholar, and starting from the framework of visual SLAM, this paper summarizes the recent development of underwater visual SLAM, mainly introducing its four parts: related sensors, front-end visual odometry, back-end state optimization, and loop closure detection, as shown in Fig. 3. For positioning, a map can be a simple set of landmarks that meets the requirements of the task; once the locations of the landmarks are determined, the map is constructed. Therefore, this paper does not introduce the mapping process at length. The structure of the paper is organized as follows: Chapter 1 summarizes the basic situation of underwater SLAM, compares three different underwater SLAM methods, introduces the basic framework of visual SLAM, and highlights the difficulties of the special underwater environment; Chapters 2, 3, 4 and 5 introduce the basic content and research status of underwater visual SLAM, covering the related sensors, front-end visual odometry, back-end state optimization and loop closure detection, respectively; Chapter 6 discusses difficulties and challenges in the field of underwater visual SLAM; Chapter 7 gives a summary and prospects.

Section snippets

Proprioceptive sensors

The sensors for underwater vehicles can be divided into proprioceptive sensors and exteroceptive ones. The latter are mainly used to perceive the external environment, while the former estimate the state and position of the vehicle itself without external assistance. In addition to necessary external information, the realization of underwater SLAM often requires proprioceptive sensors to provide information such as depth, orientation, and acceleration. Fig. 4 illustrates the common sensors

Front-end visual odometry

The front-end visual odometry roughly estimates the pose of a camera from the information acquired in adjacent images and provides a better initial value for the back-end. Visual SLAM without loop closure detection is also called visual odometry. The pose of a camera can be obtained by the following formula: C_k = R C_{k-1} + t, where (R, t) are the rotation matrix and translation vector of the camera. Given a known initial position C_0, the camera pose C_k corresponding to any time k can
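Assuming this chaining, a minimal numerical example of C_k = R C_{k-1} + t with an invented constant motion (a quarter-turn about the z-axis plus a unit translation along x at every frame) traces a closed square:

```python
import numpy as np

# Invented per-frame motion: 90-degree yaw plus 1 unit forward.
theta = np.pi / 2
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
t = np.array([1.0, 0.0, 0.0])

C = np.zeros(3)                    # known initial position C_0
for k in range(4):                 # apply C_k = R C_{k-1} + t four times
    C = R @ C + t
print(C)                           # the square path closes on the origin
```

With real image-derived (R, t) estimates, each step carries noise, which is exactly why the accumulated pose needs back-end refinement.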

Back-end state optimization

SLAM is essentially an estimation of the uncertainty of the agent itself and the surrounding space [2]. State optimization is the core of SLAM. Visual odometry only gives short-term pose estimates of the camera, so the process inevitably accumulates error, and over time the estimate becomes more and more unreliable. Building on the visual odometry, the back-end realizes state optimization over a larger spatial scale and a longer time span. For underwater
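How the back-end redistributes accumulated drift can be illustrated with a toy 1-D pose graph. The numbers and the unit-weight linear least-squares setup below are assumptions for illustration, not the formulation of any particular system:

```python
import numpy as np

# Four poses x_0..x_3 on a line, three noisy odometry steps, one loop
# closure saying x_3 - x_0 = 3.0, and an anchor fixing x_0 = 0. Stacking
# all constraints into A x = b and solving in the least-squares sense
# spreads the accumulated drift over the whole trajectory.
odometry = [1.02, 0.99, 1.04]           # measured steps x_{i+1} - x_i

A, b = [], []
for i, z in enumerate(odometry):        # odometry constraints
    row = np.zeros(4); row[i], row[i + 1] = -1.0, 1.0
    A.append(row); b.append(z)
row = np.zeros(4); row[0], row[3] = -1.0, 1.0
A.append(row); b.append(3.0)            # loop closure constraint
row = np.zeros(4); row[0] = 1.0
A.append(row); b.append(0.0)            # anchor the first pose

x, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
print(x)                                # drift shared among all steps
```

Raw odometry sums to 3.05 while the loop closure reports 3.0; the least-squares solution settles in between, with the 0.05 of drift shared across all three steps. Real back-ends solve the same kind of problem in 6-DOF with nonlinear constraints and iterative solvers.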

Loop closure detection

As a vehicle moves, cumulative error is inevitably produced (according to Formula (1)), leading to unreliable long-term estimates and to failures in establishing globally consistent trajectories and maps. Loop closure detection determines whether the vehicle has returned to a previously visited position by computing the similarity between maps, and passes the detection result to the back-end for optimization (the diagram of the loop closure detection process is shown in
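A minimal sketch of this similarity test, assuming an appearance-based (bag-of-visual-words style) representation; the histograms and the threshold are made up for illustration:

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two visual-word histograms."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Made-up 5-word histograms for two stored keyframes.
keyframes = [np.array([4.0, 1.0, 0.0, 2.0, 3.0]),
             np.array([0.0, 5.0, 2.0, 1.0, 0.0])]
current = np.array([4.0, 1.0, 1.0, 2.0, 3.0])   # revisits the first place

scores = [cosine(current, kf) for kf in keyframes]
best = int(np.argmax(scores))
is_loop = scores[best] > 0.9                    # illustrative threshold
print(best, round(scores[best], 3), is_loop)
```

A declared loop becomes a constraint between the current pose and the matched keyframe's pose, which the back-end then optimizes; practical systems additionally verify candidate matches geometrically before accepting them.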

Challenges in underwater visual SLAM

Compared with laser- and sonar-based methods, visual SLAM is not only cheap and easy to implement and install, but also well suited to creating dense maps, because it captures rich features [88]. For more complex tasks, such as reconstruction [89] and interaction [90], visual SLAM has further advantages. However, the underwater environment is often unstructured, dynamic and full of noise, which imposes many great challenges on underwater visual SLAM:

1. The sensor data are noisy. As a result, the

Conclusions and outlook

Starting from the basic framework of visual SLAM, this article explores the sensors, front-end visual odometry, back-end state optimization and loop closure detection related to underwater visual SLAM. Further, it reviews and analyzes the development of underwater visual SLAM in recent years and discusses the challenges the field still faces. Despite these challenges, underwater visual SLAM has made considerable progress. Compared with other SLAM solutions, such as

CRediT authorship contribution statement

Song Zhang: Searching and finalizing articles, Formulating research questions, Data extraction, Data cross-checking and analyzing, Writing initial draft, Revising and finalizing article. Shili Zhao: Searching and finalizing articles, Revising and finalizing article. Dong An: Searching and finalizing articles, Data cross-checking and analyzing, Revising and finalizing article, Supervision. Jincun Liu: Searching and finalizing articles, Data cross-checking and analyzing, Revising and finalizing

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This paper was supported by the Ministry of Science and Technology of the People’s Republic of China (Grant No. 2019YFE0103700), the Hebei Province Department of Science and Technology (Grant No. 20327217D) and the Shandong Province Department of Science and Technology (Grant No. 2021TZXD006).

References (95)

  • W. Zhao, T. He, A.Y.M. Sani, T. Yao, Review of SLAM techniques for autonomous underwater vehicles, in: Proceedings of...
  • A. Torres-González et al., Range-only SLAM for robot-sensor network cooperation, Auton. Robots (2018)
  • K.-W. Chiang et al., Navigation engine design for automated driving using INS/GNSS/3D LiDAR-SLAM and integrity assessment, Remote Sens. (2020)
  • A. Palomer et al., Multibeam 3D underwater SLAM with probabilistic registration, Sensors (2016)
  • M.J. Islam et al., Fast underwater image enhancement for improved visual perception, IEEE Robot. Autom. Lett. (2020)
  • M. Labbe et al., Online global loop closure detection for large-scale multi-session graph-based SLAM
  • C. Cadena et al., Past, present, and future of simultaneous localization and mapping: Toward the robust-perception age, IEEE Trans. Robot. (2016)
  • F. Hidalgo et al., Review of underwater SLAM techniques
  • M. Jiang et al., A survey of underwater acoustic SLAM system
  • A. Anwer et al., Underwater 3-D scene reconstruction using Kinect v2 based on physical models for refraction and time of flight correction, IEEE Access (2017)
  • C.-L. Tsui et al., Using a time of flight method for underwater 3-dimensional depth measurements and point cloud imaging
  • R. Mur-Artal et al., ORB-SLAM: A versatile and accurate monocular SLAM system, IEEE Trans. Robot. (2015)
  • D.G. Lowe, Distinctive image features from scale-invariant keypoints, Int. J. Comput. Vis. (2004)
  • E. Rublee et al., ORB: An efficient alternative to SIFT or SURF
  • A. Kim et al., Pose-graph visual SLAM with geometric model selection for autonomous underwater ship hull inspection
  • R. Eustice et al., Visually augmented navigation in an unstructured environment using a delayed state history
  • E. Rosten et al., Machine learning for high-speed corner detection
  • M. Calonder et al., BRIEF: Binary robust independent elementary features
  • S.A.K. Tareen et al., A comparative analysis of SIFT, SURF, KAZE, AKAZE, ORB, and BRISK
  • A. Iqbal et al., Data association and localization of classified objects in visual SLAM, J. Intell. Robot. Syst. (2020)
  • J. Aulinas et al., Feature extraction for underwater visual SLAM
  • R. Schettini et al., Underwater image processing: State of the art of restoration and image enhancement methods, EURASIP J. Adv. Signal Process. (2010)
  • Y. Cho et al., Visibility enhancement for underwater visual SLAM based on underwater light scattering model
  • J. Salvi et al., Visual SLAM for underwater vehicles using video velocity log and natural landmarks
  • Y. Cho et al., Channel invariant online visibility enhancement for visual SLAM in a turbid environment, J. Field Robotics (2018)
  • H. Durrant-Whyte et al., Simultaneous localization and mapping: Part I, IEEE Robot. Autom. Mag. (2006)
  • J. Aulinas, Y.R. Petillot, X. Lladó, J. Salvi, R. Garcia, Vision-based underwater SLAM for the SPARUS AUV, in:...
  • R.M. Eustice et al., Visually augmented navigation for autonomous underwater vehicles, IEEE J. Ocean. Eng. (2008)
  • S. Li et al., Square-root unscented Kalman filter based simultaneous localization and mapping
  • T. Maki et al., Photo mosaicing of Tagiri shallow vent area by the AUV Tri-Dog 1 using a SLAM based navigation scheme
  • R. Kümmerle et al., g2o: A general framework for graph optimization
  • L. Polok et al., Incremental block Cholesky factorization for nonlinear least squares in robotics, Robot.: Sci. Syst. (2013)
  • F. Ferreira et al., Real-time optical SLAM-based mosaicking for unmanned underwater vehicles, Intell. Serv. Robot. (2012)
  • I. Mahon et al., Efficient view-based SLAM using visual loop closures, IEEE Trans. Robot. (2008)
  • J. Aulinas et al., Selective submap joining for underwater large scale 6-DOF SLAM
  • A. Burguera et al., Towards robust image registration for underwater visual SLAM
  • S. Hong et al., A robust loop-closure method for visual SLAM in unstructured seafloor environments, Auton. Robots (2016)