Outdoor scene understanding of mobile robot via multi-sensor information fusion
Introduction
Domestic and foreign researchers have paid close attention to the challenging question of how to give robots a better understanding of their working environment, or an environmental cognitive ability similar to that of intelligent life [1]. The understanding of the environment determines whether an autonomous mobile robot can complete tasks such as motion planning, especially in random and complex outdoor environments [2]. Improving the robot's perception of the environment is an effective way to better understand outdoor scenes. However, the sensing ability of a single sensor is limited, so different sensors must coordinate effectively. Moreover, fusing multi-source sensor data at different levels is essential to strengthen scene understanding and to ensure efficient, stable operation in outdoor scenes [3]. In relatively structured indoor environments, autonomous environment perception, environment map construction, and indoor scene cognition based on multi-sensor fusion are relatively mature [4]. Owing to the diversity, randomness, and complexity of outdoor natural scenes and the mobility of mobile robots, an outdoor scene understanding system must offer high real-time performance and adaptability. Nevertheless, existing mobile robots that rely on multi-sensor fusion often fail to reach designated positions, suffering from poor real-time performance and low recognition rates [5]. Therefore, research on outdoor scene understanding of mobile robots via multi-sensor information fusion provides an important reference for the application of mobile robots and the development of related industries.
As one of the core technologies of robotics, autonomous navigation is gradually maturing. Positioning and map construction are the basis for robots to achieve autonomous navigation, and the key to whether a robot can complete designated tasks successfully and efficiently in various environments. External sensors such as laser sensors and visual sensors mounted on autonomous mobile robots are the main means of obtaining scene information [6]. Because a laser sensor can quickly and accurately obtain the depth information of a scene, it is commonly used to describe 3D (three-dimensional) outdoor scenes. The original 3D laser point cloud is obtained by a 3D laser ranging sensor to describe the outdoor scene. The laser points are fitted to planes to describe the environment, and the pose of the autonomous mobile robot is corrected by the change of plane parameters to realize 3D environment navigation [7]. A statistical method is used to construct an elevation map from laser points, which was applied to the Raptor robot to enable it to cross unknown outdoor environments. Choi et al. (2019) argued that, because laser sensors lack color and texture information and therefore describe the environment incompletely, omissions and misinformation may arise in outdoor environments and threaten the safe operation of autonomous robots [8]. Therefore, in recent years many researchers have tried to integrate laser information with image information, compensating for the shortcomings of laser sensors through the rich object and scene information obtained by visual sensors, so as to enhance the expression and utilization of objects or scenes. Wang et al. (2020) added color attributes to the laser points to achieve effective classification of laser data [9]. Du et al. (2020) realized object surface reconstruction using fused information, which can be applied to object recognition in the scene [10]. At present, the data fusion method applied to outdoor scenes adds color information to the laser points; high-precision pose estimation and 3D scene reconstruction are achieved by fusing laser and visual sensors. Alshawabkeh et al. (2021) applied the ICP (Iterative Closest Point) method to laser data fused with color information to reconstruct 3D outdoor scenes [11].
Although this approach can simply and effectively fuse the information of the two sensors, it has several problems in practice. First, the original laser point cloud contains a considerable amount of data, and directly fusing laser data with visual information occupies large storage and computing resources. Second, the laser points still require further processing, which is not conducive to the robot's subsequent applications, and attaching color information to the laser points undoubtedly adds computational burden. In addition, adding color information only to the laser points may weaken the advantages and functions of the visual system in scene understanding.
A real-time outdoor scene understanding method is proposed that effectively fuses the 3D terrain information obtained by the 3D laser ranging system with the terrain information obtained by the visual sensor. To improve the efficiency of the algorithm, the laser information and the visual information are processed in parallel. Then, the 3D laser points are used to construct an elevation map describing the topographic features of the scene, and the test image is matched against the training images based on a conditional random field model to obtain the topographic attributes of each image region.
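As a concrete illustration of the elevation-map step, the sketch below bins 3D laser points into a 2D grid and keeps a height statistic per cell. The function name, grid parameters, and the choice of maximum height as the cell statistic are illustrative assumptions, not the paper's exact implementation:

```python
import numpy as np

def build_elevation_map(points, cell=0.5, x_range=(0.0, 10.0), y_range=(0.0, 10.0)):
    """Bin 3D laser points into a 2D grid and keep a height statistic per cell.

    points : (N, 3) array of [x, y, z] laser points in the robot frame.
    Returns an (nx, ny) array of maximum heights (NaN where a cell is empty).
    """
    nx = int(np.ceil((x_range[1] - x_range[0]) / cell))
    ny = int(np.ceil((y_range[1] - y_range[0]) / cell))
    grid = np.full((nx, ny), np.nan)
    ix = ((points[:, 0] - x_range[0]) / cell).astype(int)
    iy = ((points[:, 1] - y_range[0]) / cell).astype(int)
    inside = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)
    for i, j, z in zip(ix[inside], iy[inside], points[inside, 2]):
        # Keep the maximum height per cell; a mean or variance statistic also works.
        if np.isnan(grid[i, j]) or z > grid[i, j]:
            grid[i, j] = z
    return grid
```

A mean-height or height-variance statistic per cell is an equally valid choice; the grid then serves as the carrier onto which image-derived terrain labels are later attached.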
- I
A real-time outdoor scene understanding method is proposed here, which effectively fuses the 3D terrain information obtained by the 3D laser ranging system with the landform information obtained by visual sensors.
- II
Moreover, laser information and visual information are processed in parallel to improve the efficiency of the algorithm, and an elevation map is constructed from the 3D laser points to describe the terrain features of the scene.
- III
Meanwhile, two multi-class scene recognition models suitable for RGB-DI images are proposed based on deep learning theory, namely an FCN (Fully Convolutional Network) model and a CRF (Conditional Random Field) model based on a CNN (Convolutional Neural Network). The convolutional neural network mainly provides better-learned features for the robot's subsequent scene recognition.
- IV
The recognition result of the RGB-DI image is inversely mapped into the RGB-DI point cloud scene to realize 3D scene understanding.
- V
Projection transformation and information statistics are used to realize the effective fusion of laser information and visual information, solving the storage and computation problems caused by massive laser point data.
- VI
The research results provide a reference for improving the safety of robots operating in real outdoor scenes.
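Item IV above, the inverse mapping of image recognition results onto the point cloud, can be sketched as a simple index lookup once each laser point's pixel coordinates are known. The function name and array shapes here are illustrative assumptions, not the paper's exact interface:

```python
import numpy as np

def labels_to_point_cloud(label_image, pixel_uv, valid):
    """Map per-pixel class labels back onto the 3D points that project into the image.

    label_image : (H, W) integer class map predicted for the RGB-DI image.
    pixel_uv    : (N, 2) integer [u, v] pixel coordinates of each laser point.
    valid       : (N,) boolean mask for points that fall inside the image.
    Returns an (N,) label array; points outside the image get -1 (unknown).
    """
    point_labels = np.full(pixel_uv.shape[0], -1, dtype=int)
    u = pixel_uv[valid, 0]   # column index
    v = pixel_uv[valid, 1]   # row index
    point_labels[valid] = label_image[v, u]
    return point_labels
```

The pixel coordinates and validity mask come from the laser-to-camera projection computed during fusion, so no extra search is needed at this stage.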
Data fusion by multi-sensor source sensing
Early data fusion mainly prepares the data for fusion at the sensor level, while late fusion fuses the data by type after transmission and aggregation. The global scene point cloud corresponding to the starting point is generated from laser points collected by the laser sensor and inertial navigation system during robot motion. The key to fusing laser data and camera data is to find the point pairs between the 3D laser points and the 2D pixels of the camera image.
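The point-pair search described here is typically done with a pinhole projection of the laser points into the image plane. A minimal sketch follows, assuming known camera intrinsics K and laser-to-camera extrinsics (R, t), which are not specified in the text:

```python
import numpy as np

def project_points(points, K, R, t, image_size):
    """Project 3D laser points into the camera image with a pinhole model.

    points : (N, 3) points in the laser frame.
    K      : (3, 3) camera intrinsic matrix.
    R, t   : (3, 3) rotation and (3,) translation from the laser to the camera frame.
    Returns (N, 2) integer pixel coordinates and an (N,) mask of points that
    land in front of the camera and inside the image.
    """
    cam = points @ R.T + t            # transform into the camera frame
    uvw = cam @ K.T                   # apply intrinsics (homogeneous pixel coords)
    in_front = uvw[:, 2] > 1e-6
    uv = np.zeros((points.shape[0], 2), dtype=int)
    uv[in_front] = (uvw[in_front, :2] / uvw[in_front, 2:3]).astype(int)
    w, h = image_size
    inside = in_front & (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    return uv, inside
```

Each valid (3D point, pixel) pair produced this way is exactly the correspondence the fusion step needs.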
Attribute classification of scene features
The point cloud contains the location and attribute information of its points, so feature extraction must consider both parts. On the one hand, the FPFH (Fast Point Feature Histograms) algorithm is used to describe the location features of adjacent points in the point cloud. On the other hand, color moments, similar to those commonly used for images, are applied to describe the attributes, which are the statistics of the attribute values of the points in a region.
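The color-moment descriptor mentioned here can be sketched as the first three statistical moments of each attribute channel. This is the generic formulation of color moments, not necessarily the exact statistics used in the paper:

```python
import numpy as np

def color_moments(attrs):
    """First three color moments (mean, std, skewness) per attribute channel.

    attrs : (N, C) attribute values (e.g. R, G, B, depth, intensity) of the
    points in one region or neighbourhood.
    Returns a flat (3*C,) descriptor [means, stds, skews].
    """
    mean = attrs.mean(axis=0)
    std = attrs.std(axis=0)
    centered = attrs - mean
    # Cube root of the third central moment, with sign preserved.
    third = (centered ** 3).mean(axis=0)
    skew = np.sign(third) * np.abs(third) ** (1.0 / 3.0)
    return np.concatenate([mean, std, skew])
```

Concatenating this attribute descriptor with the FPFH location descriptor gives one combined feature vector per region.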
Comparison of performance of different models
The labels 1, 2, 3, and 4 in all figures in the article correspond to A, B, C, and D in the figure annotations.
As shown in Fig. 7, point cloud features of five different attributes (space, color, size, position, and shape) are extracted in the same scene, and the CH value and DB value are computed for each category label and feature description. The results indicate that the data of Group C performs best on both the CH value and the DB value. For Group C and Group
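The CH (Calinski–Harabasz) and DB (Davies–Bouldin) values used for this comparison can be computed as below. These are the standard definitions of both indices, assuming the paper uses them in their usual form (a higher CH and a lower DB indicate better-separated clusters):

```python
import numpy as np

def calinski_harabasz(X, labels):
    """CH index: ratio of between-cluster to within-cluster dispersion (higher is better)."""
    n, labs = len(X), np.unique(labels)
    k = len(labs)
    c = X.mean(axis=0)
    B = W = 0.0
    for lab in labs:
        Xi = X[labels == lab]
        ci = Xi.mean(axis=0)
        B += len(Xi) * np.sum((ci - c) ** 2)   # between-cluster dispersion
        W += np.sum((Xi - ci) ** 2)            # within-cluster dispersion
    return (B / (k - 1)) / (W / (n - k))

def davies_bouldin(X, labels):
    """DB index: average worst-case cluster overlap (lower is better)."""
    labs = np.unique(labels)
    cents = np.array([X[labels == lab].mean(axis=0) for lab in labs])
    scatter = np.array([np.linalg.norm(X[labels == lab] - c, axis=1).mean()
                        for lab, c in zip(labs, cents)])
    k = len(labs)
    total = 0.0
    for i in range(k):
        ratios = [(scatter[i] + scatter[j]) / np.linalg.norm(cents[i] - cents[j])
                  for j in range(k) if j != i]
        total += max(ratios)
    return total / k
```

Evaluating both indices per attribute group, as in Fig. 7, makes the "Group C performs best" comparison reproducible.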
Conclusion
A real-time outdoor scene understanding method is proposed using multiple sensors (a two-dimensional laser, a monocular camera, and an inertial navigation system) based on the information fusion of the 3D laser ranging system and visual sensors. Meanwhile, a deep CNN model suitable for RGB-DI images is designed to recognize the scene. Besides, the grid in the elevation map is taken as the carrier to effectively integrate the landform information obtained from image processing with the landform
Declaration of Competing Interest
There is no potential conflict of interest in our paper. All authors have seen the manuscript and approved its submission to your journal. We confirm that the content of the manuscript has not been published or submitted for publication elsewhere.
Acknowledgement
This work was supported by the Key Project of Natural Science Research in Universities of Jiangsu Province (No. 20KJA460011); the National Natural Science Foundation of China (Nos. 51508261 and 51765007); the Natural Science Foundation of Shandong Province, China (No. ZR2021ME134); and the Doctoral Program of Shandong Jiaotong University (No. BS2020012). The authors would like to thank the reviewers for their constructive comments and substantial help that improved the presentation of the paper.
References (30)
- et al., Cooperative mapping of unknown environments by multiple heterogeneous mobile robots with limited sensing, Rob. Auton. Syst. (2017)
- et al., Technology gaps in human-machine interfaces for autonomous construction robots, Automation in Construction (2018)
- et al., Scene perception based visual navigation of mobile robot in indoor environment, ISA Trans. (2021)
- et al., Automatic apple recognition based on the fusion of color and 3D feature for robotic fruit picking, Comput. Electron. Agriculture (2017)
- et al., Impacts of extension access and cooperative membership on technology adoption and household welfare, J. Rural Stud. (2017)
- et al., Robot coverage path planning under uncertainty using knowledge inference and hedge algebras, Machines (2018)
- et al., Autonomous navigation framework for intelligent robots based on a semantic environment modeling, Appl. Sci. (2020)
- et al., Front vehicle detection based on multi-sensor fusion for autonomous vehicle, J. Intelligent & Fuzzy Syst. (2020)
- et al., Autonomous dam surveillance robot system based on multi-sensor fusion, Sensors (2020)
- et al., An online context-aware machine learning algorithm for 5G mmWave vehicular communications, IEEE/ACM Trans. Networking (2018)
- Unmanned aerial vehicles using machine learning for autonomous flight; state-of-the-art, Adv. Rob.
- Towards a theoretical framework of autonomous systems underpinned by intelligence and systems sciences, IEEE/CAA J. Automatica Sinica
- Real-time onboard 3D state estimation of an unmanned aerial vehicle in multi-environments using multi-sensor data fusion, Sensors
- Integration of Laser Scanner and Photogrammetry for Heritage BIM Enhancement, ISPRS Int. J. Geoinf.
- Multi-sensor fusion SLAM approach for the mobile robot with a bio-inspired polarised skylight sensor, IET Radar, Sonar & Navigation
Fusheng Zhang was born in Nenjiang, Heilongjiang, P.R. China, in 1971. He received his doctorate from Harbin University of Science and Technology. He now works in the School of Mechanical Engineering, Changshu Institute of Technology. His research interests include the Internet of Things and big data, special robotics, and intelligent transportation systems.
Dong-yuan Ge was born in Shaoyang, Hunan, P.R. China, in 1970. He received the Ph.D. degree in mechanical engineering from the South China University of Technology in 2013. He has been a Research Fellow with the School of Mechanical and Transportation Engineering, Guangxi University of Science and Technology since 2019. His current research interests include machine vision and machine learning.
Jun Song was born in Weihai, Shandong, P.R. China, in 1982. He received the Ph.D. degree in bridge engineering from Chongqing Jiaotong University, P.R. China. He is now an Associate Professor at Shandong Jiaotong University. His research interests include bridge health monitoring and structure security evaluation.
Wen-jiang Xiang was born in Shaoyang, Hunan, P.R. China, in 1963. He received the M.Eng. in mechanical engineering from Southeast University in 1989, and he has been a Professor of mechanical engineering at Shaoyang University. His research areas include intelligent manufacturing, precision measurement, and related topics.