Computers in Industry

Volume 99, August 2018, Pages 303-312

Grapevine buds detection and localization in 3D space based on Structure from Motion and 2D image classification

https://doi.org/10.1016/j.compind.2018.03.033

Highlights

  • Grapevine bud 3D localization in natural field conditions.

  • First steps toward high-throughput plant structuring.

  • Multi-view 3D reconstruction workflow for high-precision localization in noisy conditions.

Abstract

In viticulture, there are several applications where 3D bud detection and localization in vineyards is a necessary task amenable to automation: measurement of sunlight exposure, autonomous pruning, bud counting, type-of-bud classification, bud geometric characterization, internode length measurement, and bud development staging. This paper presents a workflow to achieve quality 3D localizations of grapevine buds based on well-known computer vision and machine learning algorithms when provided with images captured in natural field conditions (i.e., natural sunlight and no artificial elements added), during the winter season and using a mobile phone RGB camera. Our pipeline combines Oriented FAST and Rotated BRIEF (ORB) for keypoint detection, a Fast Local Descriptor for Dense Matching (DAISY) for describing the keypoints, and the Fast Library for Approximate Nearest Neighbors (FLANN) for matching keypoints, with the Structure from Motion multi-view scheme for generating consistent 3D point clouds. Next, it uses a 2D scanning-window classifier based on Bag of Features and a Support Vector Machine to classify the 3D points in the cloud. Finally, Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is applied for 3D bud localization. Our approach resulted in a maximum precision of 1.0 (i.e., no false detections), a maximum recall of 0.45 (i.e., 45% of the buds detected), and a localization error within the range of 259–554 pixels (corresponding to approximately 3 bud diameters, or 1.5 cm) when evaluated over the whole range of user-given parameters of the workflow components.
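As a concrete illustration of the matching stages named above, the following is a minimal sketch in Python using OpenCV, assuming the opencv-contrib build (DAISY lives in the xfeatures2d module). The file names and parameter values are illustrative placeholders, not the settings used in the paper.

```python
# Sketch of the 2D feature stage feeding SfM: ORB keypoints, DAISY
# descriptors, FLANN matching with Lowe's ratio test.
import cv2
import numpy as np

img1 = cv2.imread("view_a.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder files
img2 = cv2.imread("view_b.jpg", cv2.IMREAD_GRAYSCALE)

# Detect keypoints with ORB; its binary descriptors are not used here.
orb = cv2.ORB_create(nfeatures=5000)
kp1 = orb.detect(img1, None)
kp2 = orb.detect(img2, None)

# Describe each keypoint with DAISY (opencv-contrib, xfeatures2d).
daisy = cv2.xfeatures2d.DAISY_create()
kp1, des1 = daisy.compute(img1, kp1)
kp2, des2 = daisy.compute(img2, kp2)

# Match the float descriptors with a FLANN KD-tree index and keep only
# matches that pass Lowe's ratio test.
flann = cv2.FlannBasedMatcher(dict(algorithm=1, trees=5), dict(checks=50))
matches = flann.knnMatch(np.float32(des1), np.float32(des2), k=2)
good = [m for m, n in matches if m.distance < 0.7 * n.distance]
print(f"{len(good)} putative correspondences to feed the SfM stage")
```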

Introduction

In this work, we present an approach for the efficient 3D detection and localization of grapevine buds. 3D models were reconstructed from multiple images captured during the winter season in natural field conditions (i.e., natural sunlight and no artificial elements added) using a mobile phone RGB camera.

Grapevine buds were recognized early in viticulture history as one of the most important parts of the plant, mainly because they contain the whole productive capacity of the plant, from which all sprouts, leaves, bunches, and tendrils grow. Bud bunch fertility, a.k.a. fruitfulness, is of particular interest, as it has a direct impact on the main goal of vine production, that is, to increase productivity without affecting fruit quality. It has been shown that bud fruitfulness depends on the amount of sunlight exposure of buds during the period starting at bud initiation in early spring and extending through their development up to 30 days after bloom [[15], [21], [11], [25], [35], [27]]. Shading conditions during this period strongly depend on what we call the shading structure, consisting of the localization and geometric characterization of those parts of the plant that occlude sunlight, mainly the leaves and bunches that grow after bloom. In addition, sunlight exposure can be used by growers to influence the productivity of the next period by choosing those buds that received the most sunlight exposure. In practice, this happens when deciding pruning procedures late in the winter [23]. There is a balance, however, as unpruned buds will produce vegetation, shading the newly initiated buds and therefore affecting the productivity of the next period. Deciding the optimal pruning is, therefore, a complex task that must carefully balance: (i) productivity maximization of the starting period, determined by buds with maximum sun exposure, and (ii) productivity maximization of the following period, determined by the shading conditions resulting from the green vegetation growing from those buds.

A solution to the first issue requires measuring the sun exposure of individual buds at regular intervals from initiation to 30 days after bloom and then recovering this value for each bud months later during winter pruning. Sunlight exposure has been measured so far through manual positioning of radiation sensors [25]. These manual procedures, however, are far from efficient for the massive measuring of sunlight exposure of individual plants, not to mention of individual buds. Our work aims to partially fulfill the need for an efficient method for measuring and recording the sunlight exposure of individual buds. The general rationale behind our approach is that it is possible to compute the sunlight exposure of a bud with high precision when the precise 3D localization of the bud, the shading structure around it, the geo-position of the field, and the dates of interest are fed to a sun radiation model [[29], [8]]. It is an ambitious goal, partially addressed by the present work, which provides a solution to the 3D localization of winter buds. Future work, however, will have to solve the problem of producing the shading model. This could be done by localizing buds from initiation until the end of summer, and then by identifying buds across consecutive 3D reconstructions to allow the recording of long-term sun exposure. A solution to the second issue requires a thorough understanding of which summer shading structures result from different winter pruning procedures and trellis systems [[11], [14]]. This demands measuring the shading structure, a procedure which is currently unavailable.
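To make the radiation-model side of this rationale concrete, the following is a minimal sketch using the third-party pysolar package (our choice for illustration; the paper does not prescribe a specific implementation). The coordinates and date are made-up placeholders, and occlusion by the shading structure is left out, as it corresponds to the future work discussed above.

```python
# Clear-sky direct radiation accumulated over one day at a given site.
# A real system would skip the hours in which the shading structure
# occludes the bud.
from datetime import datetime, timedelta, timezone
from pysolar.solar import get_altitude
from pysolar.radiation import get_radiation_direct

LAT, LON = -33.0, -68.8  # hypothetical vineyard location (Mendoza area)
day_start = datetime(2018, 10, 1, 0, 0, tzinfo=timezone.utc)

exposure_wh = 0.0
for hour in range(24):
    t = day_start + timedelta(hours=hour)
    altitude = get_altitude(LAT, LON, t)  # solar elevation in degrees
    if altitude > 0:  # sun above the horizon
        exposure_wh += get_radiation_direct(t, altitude)  # W/m^2 over 1 h
print(f"Unshaded clear-sky exposure: {exposure_wh:.0f} Wh/m^2 per day")
```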

Simulations are a possibility for partially overcoming the inability to reconstruct the shading structure, which is necessary for solving both issues. There is a line of research that studies different procedures for producing simulated whole-plant shading structures, including the canopy and bunches [[13], [16]]. These typically require plant architecture and bud localization as input. However, bud localization information, which does not yet exist, is provided by randomly simulating bud positions. Our work provides a solution to the latter, while [26] is one of many studies that provide a solution to the former. Despite being a simulated model, the shading structure has the potential to produce invaluable, and to this day nonexistent, information on the (simulated) long-term sun exposure of large bud samples, including months with a fully grown canopy. In particular, given the plant architecture before winter pruning, it is possible to simulate the backward shading structure of the previous spring as well as the different forward shading structures resulting from different pruning treatments.

Finally, we note that both issues require an autonomous system for executing pruning. Historically, pruning procedures have been simplified to be accessible for humans. However, this may change with the extra information provided by 3D modeling, namely, the identification of fruitful buds and predictions of next-period's shading structures. With this information, the resulting optimal pruning may be too sophisticated to be amenable for human execution, requiring autonomous pruning systems.

In addition to measuring sunlight exposure and guiding autonomous pruning, bud localization is also required as part of the measurement of other variables of interest in viticulture: bud count, type-of-bud classification, bud geometric characterization, internode length, and bud development stage. Their values at any location are of importance to agronomists for deciding on possible treatments (e.g., the application of fertilizers, canopy pruning), or for predicting plant productivity. Observation and measurement of crop variables is a fundamental task that gives the agronomist information about crop state, providing the means for informed decisions on which treatments must be applied in order to maximize productivity and crop quality. At present, these variables are measured through direct or indirect human visual inspection, whose high cost often results in the measurement of only a small sample of all cases. When data are scarce, even powerful statistical techniques may still yield high uncertainty in the decision-making process, motivating the introduction of improved sensing procedures. Locating buds is a necessary step toward the proper measurement of the above variables. However, 2D localization is sufficient for all of them except internode length, for which the 3D localization of two consecutive buds on a cane is necessary to avoid perspective errors (see the sketch below). Still, the automatic, high-throughput measurement of these variables would come at no extra cost once an autonomous 3D localization system is in place.
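As a minimal illustration of the internode-length case, the sketch below computes the Euclidean distance between two consecutive bud positions in 3D; the coordinates are invented for the example.

```python
import numpy as np

# Hypothetical world coordinates (metres) of two consecutive buds on a cane.
bud_a = np.array([0.10, 0.52, 1.30])
bud_b = np.array([0.10, 0.61, 1.36])

# In 3D the distance is unaffected by perspective; a 2D projected distance
# would shrink as the cane tilts away from the image plane.
internode_length = np.linalg.norm(bud_b - bud_a)
print(f"Internode length: {internode_length * 100:.1f} cm")
```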

There are many computational approaches to aid viticulture, including detecting grapes and bunches, estimating grape size and weight, estimating production and foliar area indexes, phenotyping, and autonomous selective pulverization [[19], [30], [6], [12], [2], [31]]. For a more extensive review, see [37].

Specifically concerning grapevine buds, there are two recent studies (in 2D only) that address the problem of bud detection [[38], [12]]. The first presents a grapevine bud detection algorithm designed specifically to lay the groundwork for a future autonomous pruning system operating in the winter season (with no leaves left that may occlude the vision and operation of the cutting mechanism). Bud detection is performed on RGB images (the image resolution is not reported) captured indoors with an industrial CCD camera under controlled background and lighting conditions. To discriminate between plant and background pixels, the authors apply a simple threshold, resulting in a binary image from which a wire skeleton of the plant is obtained. Under the assumption that bud morphology is similar to that of corners, they apply Harris' algorithm [9] to the skeleton image to detect those corners. This process produces a recall of 0.702, i.e., 70.2% of buds detected. Although some improvements are suggested by the authors, the most striking limitations of this work are the need for images captured under controlled indoor conditions and the fact that the resulting localizations are in 2D. A second work on bud detection is presented by Herzog et al. [12]. This work introduces three methods of bud detection. The best results are obtained with the semi-automatic method, which requires human intervention to validate the quality of the results. Detection is based on 3456 × 2304 RGB images, where the scene is altered with an artificial black background, producing a recall of 0.94. The authors argue that this recall is enough to satisfy the phenotyping of plants. However, as the authors themselves point out, these good results are mainly explained by the particular color and morphology of the buds, captured when bud sprouts are visibly green and their average size is around 2 cm (compared to the typical 5 mm diameter of a dormant bud), which makes it easier to discriminate them visually from other plant components. Although these works represent important advancements in specific bud detection applications, they suffer from some of the following limitations: (i) the use of an artificial background, (ii) controlled indoor lighting, (iii) the need for human intervention, (iv) the detection of buds in an advanced stage of development, and (v) detection in 2D only.
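For reference, a minimal sketch of the threshold-plus-Harris scheme just described, written with OpenCV; the threshold choice and Harris parameters are illustrative guesses, not those of the cited study.

```python
import cv2
import numpy as np

img = cv2.imread("vine_indoor.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder

# Global (Otsu) threshold to separate plant pixels from the controlled
# background, yielding a binary plant skeleton image.
_, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Harris corner response on the binary image; under the corner-like-bud
# assumption, strong responses are bud candidates.
response = cv2.cornerHarris(np.float32(binary), blockSize=5, ksize=3, k=0.04)
candidates = np.argwhere(response > 0.01 * response.max())
print(f"{len(candidates)} corner-like bud candidate pixels")
```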

Dey et al. [5] introduced a pipeline for recovering the 3D structure of the grapevine plant in the spring–summer season (i.e., with leaves and fruits) from a 3D point cloud. This point cloud represents the visible surfaces of the environment, where each point is a tuple containing its 3D position in world coordinates (x, y, z). Cloud reconstruction is obtained with the algorithm proposed by Snavely et al. [28]. Afterwards, the cloud is classified into leaves, branches, and fruits by means of a supervised classification algorithm that uses shape and color features. The experiments show an accuracy of 0.98 for grapes before maturation (still green) and 0.96 for fully ripe grapes (after color change), where accuracy corresponds to the proportion of all observations (both grapes and background) that were correctly classified. Despite the similarities with our work, a direct comparison is difficult because they classify grapes while we classify buds. This is mainly due to the geometrical nature of the features they use, which one would expect to work well for close-to-spherical shapes such as that of grapes, but which may work poorly for buds, whose shape is highly irregular.
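In the spirit of that classification step, a hedged sketch follows: per-point color plus local shape features (eigenvalues of the neighborhood covariance) fed to a supervised classifier. The input arrays and file names are hypothetical, and the feature choice is our simplification of what "shape and color features" could look like.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.svm import SVC

xyz = np.load("cloud_xyz.npy")   # (N, 3) point positions, placeholder file
rgb = np.load("cloud_rgb.npy")   # (N, 3) point colors in [0, 1]
labels = np.load("labels.npy")   # (N,) leaf / branch / fruit class ids

# Local shape: eigenvalues of each point's 30-neighborhood covariance,
# which encode linear, planar, or spherical local structure.
nbrs = NearestNeighbors(n_neighbors=30).fit(xyz)
_, idx = nbrs.kneighbors(xyz)
shape_feats = np.stack(
    [np.linalg.eigvalsh(np.cov(xyz[i].T)) for i in idx]  # ascending order
)

features = np.hstack([rgb, shape_feats])
clf = SVC(kernel="rbf").fit(features, labels)  # supervised point classifier
```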

Section snippets

Materials and methods

In this section we provide a detailed description of our approach to the 3D detection and localization of grapevine buds, together with a description of the input image collection.

The detection and localization workflow consists of five stages as depicted in Fig. 1: (1) a 3D construction technique known as Structure from Motion [10] that, given as input a set of 2D images of some scene, produces both the 3D geometry (point cloud) of the scene and the camera pose of each 2D image; (2) a
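As a concrete illustration of the last stage of this workflow, localizing buds from the bud-classified points of the cloud, the following is a minimal DBSCAN sketch; the eps and min_samples values and the input file are illustrative placeholders, not the paper's settings.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# 3D points previously classified as "bud" by the 2D scanning-window
# classifier (placeholder file).
bud_points = np.load("bud_points_xyz.npy")  # (N, 3)

clustering = DBSCAN(eps=0.02, min_samples=10).fit(bud_points)
for cid in sorted(set(clustering.labels_) - {-1}):  # -1 marks noise points
    centre = bud_points[clustering.labels_ == cid].mean(axis=0)
    print(f"bud cluster {cid}: localized at {centre}")
```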

Experiments

In this section we present the results of systematic experiments that evaluate the quality of the 3D structures produced by our approach. We first introduce quantitative performance measures that assess detection and localization errors: hard errors, i.e., true buds that were undetected or clusters that detected no bud, and soft errors, i.e., how far the correctly detected buds fell from the actual position of the buds they detected. Values for these performance measures are reported
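To fix ideas, a minimal sketch of such measures follows: precision and recall capture the hard errors, while the pixel distance from each correct detection to its bud captures the soft error. The greedy nearest-bud matching within a radius is our simplification, not necessarily the exact matching rule used in the paper.

```python
import numpy as np

def evaluate(detections, ground_truth, max_dist):
    """detections, ground_truth: (N, 2) pixel positions; max_dist: match radius."""
    tp, matched, soft_errors = 0, set(), []
    for d in detections:
        dists = np.linalg.norm(ground_truth - d, axis=1)
        j = int(np.argmin(dists))
        if dists[j] <= max_dist and j not in matched:
            tp += 1
            matched.add(j)
            soft_errors.append(dists[j])  # distance to the detected bud
    precision = tp / len(detections) if len(detections) else 0.0
    recall = tp / len(ground_truth)
    return precision, recall, soft_errors
```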

Discussion

From Fig. 6 we considered as best outcomes those located at precision = 1 (i.e., all detections correspond to actual buds) and recall in a range from 0.38 to 0.45 (i.e., between 38% and 45% of buds detected). These assignments show localization errors in the range of 259–554 pixels, corresponding to approximately 3 bud diameters, or roughly 1.5 cm. This is because, for the image scale in the collection, the average bud diameter is 159 pixels, with 95% of the total probability mass falling within the
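A back-of-the-envelope check of this scale conversion, using the 159-pixel average bud diameter above and the 5 mm dormant-bud diameter mentioned earlier:

```python
BUD_DIAMETER_PX = 159   # average bud diameter at the collection's image scale
BUD_DIAMETER_CM = 0.5   # typical dormant-bud diameter (5 mm)

for err_px in (259, 554):
    diam = err_px / BUD_DIAMETER_PX
    print(f"{err_px} px = {diam:.1f} bud diameters = {diam * BUD_DIAMETER_CM:.1f} cm")
# 259 px gives 1.6 diameters (0.8 cm); 554 px gives 3.5 diameters (1.7 cm)
```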

Conclusions

In this work we introduce a workflow for the localization of grapevine buds in 3D space, obtained from 3D models of plant parts reconstructed from multiple 2D images captured during the winter season using RGB mobile phone cameras in natural field conditions. The proposed workflow is based on well-known computer vision and machine learning algorithms, such as SfM, SIFT, BoF, SVM, DAISY, ORB and DBSCAN. We justified the importance of bud 3D detection through their potential applications, such as

Acknowledgments

This work was funded by the National Technological University (UTN), the National Council of Scientific and Technical Research (CONICET), Argentina, and the National Fund for Scientific and Technological Promotion (FONCyT), Argentina. We thank the National Agricultural Technology Institute (INTA) for offering their vineyards to capture the images used in this work.

References (38)

  • R. Hartley et al., Multiple View Geometry in Computer Vision, Cambridge Books Online (2003).

  • E.W. Hellman, Grapevine structure and function.

  • K. Herzog, Initial steps for high-throughput phenotyping in vineyards, Australian and New Zealand Grapegrower and Winemaker (603) (2014).

  • A. Iandolino et al., Simulating three-dimensional grapevine canopies and modelling their light interception characteristics, Aust. J. Grape Wine Res. (2013).

  • M. Keller, The Science of Grapevines: Anatomy and Physiology (2015).

  • S. Khanduja et al., Fruitfulness of grape vine buds, Econ. Bot. (1972).

  • G. Louarn et al., A three-dimensional statistical reconstruction model of grapevine (Vitis vinifera) simulating canopy structure variability within and between cultivar/training system pairs, Ann. Bot. (2008).

  • D.G. Lowe, Distinctive image features from scale-invariant keypoints, Int. J. Comput. Vis. (2004).

  • M. Muja et al., Fast approximate nearest neighbors with automatic algorithm configuration.