3D global mapping of large-scale unstructured orchard integrating eye-in-hand stereo vision and SLAM
Introduction
A fruit-picking robot equipped with a stereo vision system improves the efficiency and quality of orchard harvesting. At present, most stereo vision systems, including binocular, RGB-D and multi-view systems, can adapt to the complex backgrounds, uneven lighting and low color contrast of unstructured orchards (Tang et al., 2020).
The key question for further research on fruit picking is how to build a compact, coordinated and practical picking robot on top of existing high-performance stereo vision systems. There have been several representative cases. Xiong et al. (2018) constructed a fruit-picking robot using two CCD cameras; the main contribution of this work is a method for calculating the oscillation angle of a fruit under natural disturbance. Ge et al. (2019, 2020) demonstrated a strawberry-picking robot using RGB-D vision. They proposed a method that estimates the 3D shape of the fruit from a small number of images, calculated the safe working area of the robot, and constructed a complete strawberry-picking process. Lin et al. (2019) built a guava-picking robot; its innovation is an efficient stereo vision method for reconstructing obstacles on trees. Li et al. (2020) trained a high-performance semantic segmentation network and obtained the spatial position of tiny fruit stems directly from the original sampled images, realizing end-to-end perception of the picking point. Silwal et al. (2017) combined agronomic methods to develop a picking robot for orchards with ideal fruit distribution; this work is practical and constructive because it gives feasible solutions for achieving a complete picking process. Wibowo et al. (2017) developed an end-to-end autonomous coconut-harvesting robot that can climb coconut trees and detect fruits through a vision system. They designed a novel fixing and cutting mechanism that is small and easy to carry, providing a reference for the design of similar lightweight harvesting robots. Zhang et al. (2020) introduced a robot for harvesting fruit with pedicels, combining a deep instance segmentation network and spatial geometric methods to quickly solve for the location of the picking point.
They also designed an end effector that effectively protects the pulp from damage, and the system was successfully verified in a laboratory environment. Williams et al. (2019) proposed a kiwifruit-picking robot composed of four robotic arms. Its vision system combined deep learning and stereo vision to perceive kiwifruit under natural lighting, and a dynamic fruit-scheduling system successfully coordinated the various modules. The robot passed stability tests in the orchard environment and is a complete, usable product.
The above cases successfully applied vision-based fruit-picking robots in unstructured environments. However, most of them are limited to studies of basic issues or are deployed only in local scenes. This is far from meeting the needs of picking tasks spanning long durations and large areas. The behavior of a picking robot at a dynamic global scale is very complicated, and many more factors must be considered, such as the coordination of the mobile platform and robotic arm, the efficiency and completeness of image sampling, changes in terrain and obstacles, and differences in distance between targets. This requires highly integrated hardware, stable 3D vision algorithms, and a compact system framework. In addition, more stability tests in large-scale orchards are necessary.
In recent years, attention and effort in visual picking robotics have shifted from the local to the global scale. Establishing an intelligent, robust, large-scale picking system has gradually become a consensus in the field of harvesting robots (Tang et al., 2020). Visual SLAM can accurately construct a large-scale scene, track the position of the vision system in the global map, and estimate the motion trajectory; it is therefore a potential solution for large-scale picking tasks. There have already been some exploratory studies on this topic. Dong et al. (2020) used an RGB-D camera to collect point clouds of trees and fruits in an orchard and then utilized the ORB-SLAM2 framework for motion tracking and mapping. Habibie et al. (2018) collected simulated agricultural data through RGB cameras and lasers and then generated a global grid map of fruits and trees. Shalal et al. (2015) fused camera and laser-scanner data to detect tree trunks and construct local orchard maps; the developed algorithm relied only on the on-board sensors of the mobile robot, without adding artificial landmarks to the orchard. However, this work focused on the distribution of trees on the map and lacked detailed information about fruit structure. Underwood et al. (2016) proposed a 3D mobile scanning system for almond orchards to reconstruct the distribution map of flowers and fruits and estimate the yield of single trees; they used a vehicle equipped with lidar and cameras to scan the orchard and built a canopy model database over a long time span. Fan et al. (2018) developed a portable and flexible RGB-D SLAM system to estimate the position, height and specific geometric parameters of trees over a large forest area; because the point cloud resolution of the RGB-D camera dropped sharply at long range, the point cloud fitting was relatively rough.
Nellithimaru and Kantor (2019) proposed a visual SLAM system for fast counting of fruits in vineyards. It combined classic 3D reconstruction with a robust instance segmentation network to obtain a clear 3D structure of the grapes; however, the structure of this SLAM system was relatively simple, so error accumulation remains a risk. Ivanov et al. (2020) presented a complete set of technical solutions for outdoor mobile robots, including visual perception, navigation and movement. While satisfying the basic functions, the solution guaranteed stable exchange of the robot's internal data; it has been successfully applied to different complex scenarios and is of great importance for the behavioral logic design of future mobile robots. In addition to the above cases, other types of agricultural SLAM systems based on GPS, ultrasonic sensors or inertial measurement units (IMUs) have also been proposed and verified (Chen et al., 2018, Gan et al., 2017, Katikaridis et al., 2019).
The SLAM technologies involved in the above research make it possible to build vision systems suited to large-scale environments. However, it should be pointed out that such work is still in its infancy, and much remains to be done to realize stable, practical orchard picking applications. Capua et al. (2018) pointed out that the performance of existing SLAM systems cannot fully meet the needs of large-scale agricultural tasks; they believed that building a new form of SLAM system, optimizing the target tracking algorithm, and increasing the type and number of sensors may solve some of these problems. Other studies have also noted that off-the-shelf SLAM systems may encounter stability problems when applied to agricultural environments (Chebrolu et al., 2017, Gao et al., 2018, Pierzchała et al., 2018). In addition, the huge demand of large-scale SLAM for computing resources is one of the most important factors restricting its application in agriculture (Aguiar et al., 2020, Zhao et al., 2020).
Although SLAM achieves localization and mapping at the same time, most SLAM applications for autonomous platforms (self-driving, high-altitude cruising, logistics scheduling, etc.) pay more attention to the real-time localization of the mobile platform than to the 3D detail of the constructed map. To ensure real-time performance, dense spatial features are usually ignored and only a sparse map is constructed; even when a dense map is built, it typically serves only to improve navigation. Picking in large-scale orchards, on the contrary, requires high local 3D reconstruction accuracy to determine the shape, size, maturity and 3D structure of each observable fruit and then generate a yield map. The more complete the output yield map, the better it can help estimate suitable picking areas and form a picking plan for the robot. Therefore, a global mapping system fully compatible with the nature of the orchard picking task is urgently needed. Compared with SLAM, stereo vision pays more attention to understanding local structural details, so it can be utilized to improve the resolution of SLAM and construct a higher-performance orchard mapping system.
In this study, a new form of mobile picking robot and a mapping framework integrating stereo vision and SLAM were established. Specifically, an eye-in-hand stereo vision system was built to sample images and generate local maps, and a SLAM system was constructed to estimate the global trajectory of the cameras. Finally, the local maps were stitched according to the global trajectory to form a detailed, wide, 3D orchard map. The main contribution of this work is to combine high-accuracy stereo vision with large-scale SLAM technology to construct a high-performance global mapping framework compatible with fruit-picking tasks. All the technologies mentioned in this study are necessary components of this mapping framework, and they are mutually cooperative.
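The stitching step described above can be sketched in a few lines: each local dense map, expressed in its camera frame, is transformed into the world frame by the camera-to-world pose estimated by SLAM and then concatenated. This is a minimal illustration under assumed data formats (4x4 homogeneous poses, N x 3 point clouds), not the paper's actual implementation:

```python
import numpy as np

def pose_matrix(R, t):
    """Build a 4x4 homogeneous transform from a 3x3 rotation R and translation t."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def stitch_local_maps(local_clouds, camera_poses):
    """Transform each local point cloud (N_i x 3, camera frame) into the world
    frame using its 4x4 camera-to-world pose, then concatenate into one map."""
    global_points = []
    for cloud, T_world_cam in zip(local_clouds, camera_poses):
        homo = np.hstack([cloud, np.ones((cloud.shape[0], 1))])  # N x 4
        global_points.append((T_world_cam @ homo.T).T[:, :3])
    return np.vstack(global_points)
```

In practice the poses would come from the SLAM trajectory and the clouds from stereo matching; duplicate points in overlapping views would additionally need filtering or fusion.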
Overall framework
An eye-in-hand mobile robot based on binocular vision and a 6-DOF robotic arm was constructed. The system comprises four modules: sampling, object detection, local dense mapping, and global trajectory estimation, as shown in Fig. 1.
The sampling module was implemented based on the eye-in-hand system. An efficient hand-eye calibration method was proposed to determine the coordinate correlation between the robotic arm and the camera.
The object detection module and the local dense mapping
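The coordinate correlation that hand-eye calibration determines can be expressed as a chain of homogeneous transforms. The sketch below (with assumed function names and pose conventions; it does not reproduce the paper's calibration procedure) shows how a point observed in the camera frame is mapped into the robot-base frame via the current gripper pose and the fixed, calibrated camera-to-gripper transform:

```python
import numpy as np

def homogeneous(R, t):
    """Build a 4x4 homogeneous transform from rotation R and translation t."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def camera_point_to_base(p_cam, T_base_gripper, T_gripper_cam):
    """Map a 3D point from camera coordinates to robot-base coordinates.
    T_gripper_cam is the fixed transform found by hand-eye calibration;
    T_base_gripper is the current gripper pose from forward kinematics."""
    p = np.append(p_cam, 1.0)  # homogeneous coordinates
    return (T_base_gripper @ T_gripper_cam @ p)[:3]
```

Because the camera rides on the arm (eye-in-hand), T_base_gripper changes with every arm motion while T_gripper_cam stays constant, which is why the calibration only needs to be performed once.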
Experiment
This section is divided into two parts: local experiments and global experiments. In the local experiments, the performance of the sampling module, object detection module and local dense mapping module was verified under the local scenes of the orchard, respectively. In the global experiments, detailed and broad global 3D maps of the orchards were obtained, facilitating the validation of the performance of the entire mapping system.
Conclusions and future work
The basic vision technologies of orchard picking robots, such as image segmentation, 3D positioning and surface reconstruction, have been substantially developed through the joint efforts of researchers. Global perception, general frameworks, and practical applications will be key to the future development of visual picking robots.
In this study, a mobile platform equipped with an eye-in-hand stereo vision system and a 6-DOF robotic arm was built. A flexible 3D
CRediT authorship contribution statement
Mingyou Chen: Methodology, Software, Writing - review & editing, Writing - original draft. Yunchao Tang: Conceptualization, Writing - original draft. Xiangjun Zou: Supervision. Zhaofeng Huang: Methodology, Data curation. Hao Zhou: Investigation, Visualization. Siyu Chen: Data curation.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
This work was supported by the Special Funding Project of Guangdong Enterprise Science and Technology Commissioner (GDKTP2020029500), the Key-area Research and Development Program of Guangdong Province (2019B020223003), and the Major scientific research projects of Guangdong Province (2020KZDZX1037).
References
- Three-dimensional perception of orchard banana central stock enhanced by adaptive multi-vision technology. Comput. Electron. Agric. (2020)
- Multi-feature fusion tree trunk detection and orchard mobile robot localization using camera/ultrasonic sensors. Comput. Electron. Agric. (2018)
- Symmetry-based 3D shape completion for fruit localisation for harvesting robots. Biosyst. Eng. (2020)
- Mapping forests using an unmanned ground vehicle with 3D LiDAR and graph-SLAM. Comput. Electron. Agric. (2018)
- Orchard mapping and mobile robot localisation using on-board camera and laser scanner data fusion - Part A: Tree detection. Comput. Electron. Agric. (2015)
- Mapping almond orchard canopy volume, flowers, fruit and yield using lidar and vision sensors. Comput. Electron. Agric. (2016)
- Robotic kiwifruit harvesting using machine vision, convolutional neural networks, and robotic arms. Biosyst. Eng. (2019)
- The recognition of litchi clusters and the calculation of picking point in a nocturnal natural environment. Biosyst. Eng. (2018)
- A flexible new technique for camera calibration (2000)
- Localization and mapping for robots in agriculture and forestry: a survey. Robotics (2020)
- Agricultural robot dataset for plant classification, localization and mapping on sugar beet fields. Int. J. Rob. Res.
- Hand-eye calibration using dual quaternions. Int. J. Rob. Res.
- Semantic mapping for orchard environments by merging two-sides reconstructions of tree rows. J. Field Robot.
- Estimating tree position, diameter at breast height, and tree height in real-time using a mobile phone with RGB-D SLAM. Remote Sens.
- Review of wheeled mobile robots' navigation problems and application prospects in agriculture. IEEE Access
- Fruit localization and environment perception for strawberry harvesting robots. IEEE Access