Abstract
Viticulture has a major impact on the European economy, and over the years intensive grapevine production has led to the proliferation of many varieties. Traditionally, these varieties are catalogued manually in the field, a costly and slow process that is, in many cases, very challenging even for an experienced ampelographer. This article presents a cost-effective, automatic method for grapevine variety classification based on the analysis of leaf images taken with an RGB sensor. The proposed method is divided into three steps: (1) color and shape feature extraction; (2) training; and (3) classification using Linear Discriminant Analysis. The approach was applied to 240 leaf images of three grapevine varieties acquired in the Douro Valley region of Portugal, and it correctly classified 87% of the grapevine leaves. The method showed very promising classification capability given the challenges posed by the leaves, which exhibited many shape irregularities and, in many cases, high color similarity across varieties. Compared with the manual procedure, the results suggest that the method can serve as an effective alternative for grapevine classification based on leaf features. Since it requires only a simple, low-cost setup, it can easily be integrated into a portable system with real-time processing to assist technicians or other staff without special skills in the field, or used offline for batch classification.
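As a rough illustration of step (3), the sketch below shows how Linear Discriminant Analysis assigns a feature vector to a variety. This is a minimal NumPy implementation written for this summary, not the authors' code; the two-dimensional feature values and class labels are synthetic placeholders standing in for the extracted leaf features.

```python
import numpy as np

def lda_fit(X, y):
    """Fit LDA: per-class means, pooled within-class covariance, class priors."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    n, d = X.shape
    cov = np.zeros((d, d))
    for c, m in zip(classes, means):
        Xc = X[y == c] - m
        cov += Xc.T @ Xc
    cov /= n - len(classes)                      # pooled (unbiased) covariance
    priors = np.array([(y == c).mean() for c in classes])
    return classes, means, np.linalg.inv(cov), priors

def lda_predict(model, X):
    """Assign each row of X to the class with the largest linear discriminant score."""
    classes, means, cov_inv, priors = model
    # delta_k(x) = x' S^-1 mu_k - 0.5 mu_k' S^-1 mu_k + log(pi_k)
    scores = (X @ cov_inv @ means.T
              - 0.5 * np.sum((means @ cov_inv) * means, axis=1)
              + np.log(priors))
    return classes[np.argmax(scores, axis=1)]

# Synthetic stand-in for three varieties described by two leaf features
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.4, size=(80, 2)) for c in ([0, 0], [3, 3], [6, 0])])
y = np.repeat([0, 1, 2], 80)
model = lda_fit(X, y)
accuracy = (lda_predict(model, X) == y).mean()
```

Because LDA shares a single covariance matrix across classes, the decision boundaries are linear, which keeps the classifier cheap enough for the portable, real-time setting the paper targets.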
References
Vivier, M.A., Pretorius, I.S.: Genetically tailored grapevines for the wine industry. Trends Biotechnol. 20, 472–478 (2002)
This, P., Lacombe, T., Thomas, M.: Historical origins and genetic diversity of wine grapes. Trends Genet. 22, 511–519 (2006)
Thomas, M.R., Cain, P., Scott, N.S.: DNA typing of grapevines: a universal methodology and database for describing cultivars and evaluating genetic relatedness. Plant Mol. Biol. 25, 939–949 (1994)
Diago, M.P., Fernandes, A.M., Millan, B., Tardaguila, J., Melo-Pinto, P.: Identification of grapevine varieties using leaf spectroscopy and partial least squares. Comput. Electron. Agric. 99, 7–13 (2013)
Fuentes, S., Hernández-Montes, E., Escalona, J.M., Bota, J., Gonzalez Viejo, C., Poblete-Echeverría, C., Tongson, E., Medrano, H.: Automated grapevine cultivar classification based on machine learning using leaf morpho-colorimetry, fractal dimension and near-infrared spectroscopy parameters. Comput. Electron. Agric. 151, 311–318 (2018)
Gutiérrez, S., Tardaguila, J., Fernández-Novales, J., Diago, M.P.: Support vector machine and artificial neural network models for the classification of grapevine varieties using a portable NIR spectrophotometer. PLoS ONE 10, e0143197 (2015)
Gutiérrez, S., Fernández-Novales, J., Diago, M.P., Tardaguila, J.: On-The-Go hyperspectral imaging under field conditions and machine learning for the classification of grapevine varieties. Front. Plant Sci. 9, 1102 (2018)
Karasik, A., Rahimi, O., David, M., Weiss, E., Drori, E.: Development of a 3D seed morphological tool for grapevine variety identification, and its comparison with SSR analysis. Sci. Rep. 8, 6545 (2018)
Otsu, N.: A threshold selection method from gray-level histograms. IEEE Trans. Syst. Man Cybern. 9, 62–66 (1979)
Bendig, J., et al.: Combining UAV-based plant height from crop surface models, visible, and near infrared vegetation indices for biomass monitoring in barley. Int. J. Appl. Earth Obs. Geoinf. 39, 79–87 (2015)
Gonzalez, R.C., Woods, R.E.: Digital Image Processing, 4th edn. Pearson (2017)
Rodgers, J.L., Nicewander, W.A.: Thirteen ways to look at the correlation coefficient. Am. Stat. 42, 59–66 (1988)
Guyon, I., Elisseeff, A.: An introduction to variable and feature selection. J. Mach. Learn. Res. 3, 1157–1182 (2003)
Yu, H.-F., Huang, F.-L., Lin, C.-J.: Dual coordinate descent methods for logistic regression and maximum entropy models. Mach. Learn. 85, 41–75 (2011)
Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer Series in Statistics, 2nd edn. Springer, New York (2009). https://doi.org/10.1007/978-0-387-84858-7
Altman, N.S.: An introduction to kernel and nearest-neighbor nonparametric regression. Am. Stat. 46, 175–185 (1992)
Breiman, L.: Classification and Regression Trees. Routledge, Boca Raton (2017)
Zhang, H.: The optimality of naive Bayes. In: Proceedings of the 17th International FLAIRS Conference (2004)
Guyon, I., Boser, B., Vapnik, V.: Automatic capacity tuning of very large VC-dimension classifiers. In: Proceedings Advances in Neural Information Processing Systems, vol. 5, pp. 147–155 (1992)
Kohavi, R.: A study of cross-validation and bootstrap for accuracy estimation and model selection. In: Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI), Montreal, pp. 1137–1145 (1995)
Du, J.-X., Wang, X.-F., Zhang, G.-J.: Leaf shape based plant species recognition. Appl. Math. Comput. 185, 883–893 (2007)
Silva, P.F.B., Marçal, A.R.S., da Silva, R.M.A.: Evaluation of features for leaf discrimination. In: Kamel, M., Campilho, A. (eds.) ICIAR 2013. LNCS, vol. 7950, pp. 197–204. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-39094-4_23
Pauwels, E.J., de Zeeuw, P.M., Ranguelova, E.B.: Computer-assisted tree taxonomy by automated image recognition. Eng. Appl. Artif. Intell. 22, 26–31 (2009)
Yang, M., Kpalma, K., Ronsin, J.: A survey of shape feature extraction techniques. In: Yin, P.-Y. (ed.) Pattern Recognition, pp. 43–90. IN-TECH (2008)
Ghozlen, N.B., Cerovic, Z.G., Germain, C., Toutain, S., Latouche, G.: Non-destructive optical monitoring of grape maturation by proximal sensing. Sensors 10, 10040–10068 (2010)
Mokhtarian, F., Mackworth, A.K.: A theory of multiscale, curvature-based shape representation for planar curves. IEEE Trans. Pattern Anal. Mach. Intell. 14, 789–805 (1992)
Jalba, A.C., Wilkinson, M.H.F., Roerdink, J.B.T.M.: Shape representation and recognition through morphological curvature scale spaces. IEEE Trans. Image Process. 15, 331–341 (2006)
Acknowledgements
This work is financed by the ERDF – European Regional Development Fund through the Operational Programme for Competitiveness and Internationalisation - COMPETE 2020 Programme within project «POCI-01-0145-FEDER-006961», and by National Funds through the FCT – Fundação para a Ciência e a Tecnologia (Portuguese Foundation for Science and Technology) as part of project UID/EEA/50014/2013.
Appendices
Appendix A
Feature | Description | Equation |
---|---|---|
Eccentricity | Defined as the ratio of the length of the main inertia axis of the ROI, \( E_A \), to the length of the minor inertia axis of the ROI, \( E_B \) | \( E = \frac{E_A}{E_B} \) |
Aspect Ratio | Ratio between the maximum length \( D_{max} \) and the minimum length \( D_{min} \) of the minimum bounding rectangle (MBR) of the leaf, holes included | \( AR = \frac{D_{max}}{D_{min}} \) |
Aspect Ratio 2 | Same as Aspect Ratio but without considering the holes of the leaf | |
Elongation | For each point \( x \) inside the ROI, the minimal distance \( d(x, \partial ROI) \) to the boundary contour \( \partial ROI \) is computed, and \( D_{me} \) denotes its maximal value over the region | \( D_{me} = \max_{x \in ROI} d\left( x, \partial ROI \right) \) |
 | Elongation is then defined as: | \( E = 1 - \frac{2D_{me}}{D_{ROI}} \) |
Solidity | Describes the extent to which the shape is convex or concave | \( S = \frac{A_{ROI}}{A_{CH}} \) |
Isoperimetric Factor | If the closed contour \( \partial ROI \) of length \( L(\partial ROI) \) encloses a region of area \( A(ROI) \), the isoperimetric factor is defined by | \( IF = \frac{4\pi A\left( ROI \right)}{L\left( \partial ROI \right)^{2}} \) |
Maximal Indentation Depth | For each point on the contour of the ROI, the distance to the convex hull is determined, expressing this distance as a function; the Maximal Indentation Depth is the maximum of this function | |
Rectangularity | Represents the ratio of \( A_{ROI} \) to the Minimum Bounding Rectangle (MBR) area | \( Rect = \frac{A_{ROI}}{D_{max} \times D_{min}} \) |
Convex Perimeter Ratio [25] | Represents the ratio of the \( P_{ROI} \) and the \( P_{CH} \) | \( CPR = \frac{{P_{ROI} }}{{P_{CH} }} \) |
Circularity | Computed from all boundary points of the ROI, where \( \mu_R \) is the mean distance between the centre of the ROI and the boundary points, \( \mu_R = \frac{1}{N}\sum_{i = 0}^{N - 1} \left\| \left( x_{i}, y_{i} \right) - \left( \bar{x}, \bar{y} \right) \right\| \), and \( \sigma_R \) is the quadratic mean deviation of that distance, \( \sigma_R = \frac{1}{N}\sum_{i = 0}^{N - 1} \left( \left\| \left( x_{i}, y_{i} \right) - \left( \bar{x}, \bar{y} \right) \right\| - \mu_R \right)^{2} \) | \( C = \frac{\mu_R}{\sigma_R} \) |
Sphericity [25] | Represents the ratio between the radius of the incircle of the ROI, \( R_{in} \), and the radius of the excircle of the ROI, \( R_{ex} \) | \( S = \frac{R_{in}}{R_{ex}} \) |
Entirety | Ratio of the difference between \( A_{CH} \) and \( A_{ROI} \) to \( A_{ROI} \) | \( Ent = \frac{A_{CH} - A_{ROI}}{A_{ROI}} \) |
Extent | Ratio between \( A_{ROI} \) and the product of the bounding box (BB) width and height | \( Ex = \frac{A_{ROI}}{BB_{width} \times BB_{height}} \) |
Equiv Diameter | Computes the diameter of a circle with the same area as the ROI | \( ED = \sqrt{\frac{4 \times A_{ROI}}{\pi}} \) |
Corners | Number of corners within a 5 × 5 pixel neighbourhood. To use the curvature function \( K(n) \) for shape representation, it is computed from the parametric representation of the planar contour. If \( n \) is the normalized arc-length parameter \( s \), then \( K\left( s \right) = \dot{x}\left( s \right)\ddot{y}\left( s \right) - \dot{y}\left( s \right)\ddot{x}\left( s \right) \). Because the curvature function is computed only from parametric derivatives, it is invariant under rotations and translations; it is, however, scale dependent, i.e., inversely proportional to the scale. Scale independence can be achieved by normalizing by the mean absolute curvature, \( K^{\prime}\left( s \right) = \frac{K\left( s \right)}{\frac{1}{N}\sum_{s = 1}^{N} \left| K\left( s \right) \right|} \), where \( N \) is the number of points on the normalized contour | \( K\left( n \right) = \frac{\dot{x}\left( n \right)\ddot{y}\left( n \right) - \dot{y}\left( n \right)\ddot{x}\left( n \right)}{\left( \dot{x}\left( n \right)^{2} + \dot{y}\left( n \right)^{2} \right)^{3/2}} \) |
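Several of the shape descriptors above can be computed directly from a leaf's boundary contour. The sketch below, written for this summary (it is not the authors' implementation), derives solidity and the isoperimetric factor from a polygonal contour using the shoelace area, the contour perimeter, and an Andrew monotone-chain convex hull; the example contours are hypothetical.

```python
import numpy as np

def polygon_area(pts):
    """Shoelace formula: area enclosed by a closed polygonal contour."""
    p = np.asarray(pts, float)
    x, y = p[:, 0], p[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def perimeter(pts):
    """Total length of the closed contour."""
    p = np.asarray(pts, float)
    return np.linalg.norm(np.roll(p, -1, axis=0) - p, axis=1).sum()

def convex_hull(pts):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(map(tuple, pts)))
    if len(pts) <= 2:
        return pts
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def solidity(contour):
    """S = A_ROI / A_CH, as in the table above."""
    return polygon_area(contour) / polygon_area(convex_hull(contour))

def isoperimetric_factor(contour):
    """IF = 4*pi*A(ROI) / L(dROI)^2; 1 for a circle, smaller for other shapes."""
    return 4 * np.pi * polygon_area(contour) / perimeter(contour) ** 2

square = [(0, 0), (4, 0), (4, 4), (0, 4)]
notched = [(0, 0), (4, 0), (4, 4), (2, 2), (0, 4)]  # square with a concave notch
```

For the convex square the isoperimetric factor is π/4 ≈ 0.785 and solidity is exactly 1; the notch lowers solidity to 0.75, illustrating how these descriptors capture the lobed, indented outlines of grapevine leaves.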
Appendix B
Feature | Description | Equation |
---|---|---|
Mean | Average of leaf pixel values | |
Deviation | Standard deviation of leaf pixel values | |
Softness | Calculates the smoothness of the image | \( Sft = 1 - \frac{1}{{1 + Deviation^{2} }} \) |
Contrast | Returns the average of the measure of the intensity contrast between a pixel and its neighbour over the whole image, also known as Variance | \( C = \sum\limits_{i,j} {\left| {i - j} \right|^{2} p\left( {i,j} \right)} \) |
Correlation | Returns the average of the measure of how correlated a pixel is to its neighbour over the whole image, where the range is between –1 and 1. Correlation is 1 or –1 for a perfectly positively or negatively correlated image. Correlation is NaN for a constant image | \( Corr = \sum\limits_{i,j} {\frac{{\left( {i - \mu i} \right)\left( {j - \mu j} \right)p\left( {i,j} \right)}}{{\sigma_{i} \sigma_{j} }}} \) |
Energy | Returns the average of the sum of squared elements in the GLCM, where the range is between 0 and 1. Energy is 1 for a constant image. The property is also known as uniformity, uniformity of energy, or angular second moment, and is calculated by: | \( En = \sum\limits_{i,j} {p\left( {i,j} \right)^{2} } \) |
Homogeneity | Returns the average of the value that measures the closeness of the distribution of elements in the GLCM to the GLCM diagonal, where the range is between 0 and 1. Homogeneity is 1 for a diagonal GLCM | \( Homo = \sum\limits_{i,j} {\frac{{p\left( {i,j} \right)}}{{1 + \left| {i - j} \right|}}} \) |
Mean Hist | Calculates the average of the approximate probability density of intensity occurrence in the histogram | |
Variance Hist | Calculates the variance of the approximate probability density of intensity occurrence in the histogram | |
Skewness Hist | Calculates the skewness of the approximate probability density of intensity occurrence in the histogram. Skewness measures the asymmetry of the data around the sample mean: if negative, the data are spread out more to the left of the mean; if positive, more to the right. The skewness of the normal distribution (or any perfectly symmetric distribution) is zero. Here µ is the mean of x, σ is the standard deviation of x, and E(t) is the expected value of t | \( Sk = \frac{{E\left( {x - \mu } \right)^{3} }}{{\sigma^{3} }} \) |
Kurtosis Hist | Calculates the kurtosis of the approximate probability density of intensity occurrence in the histogram. Kurtosis measures how outlier-prone a distribution is: the kurtosis of the normal distribution is 3; distributions more outlier-prone than the normal have kurtosis greater than 3, less outlier-prone ones less than 3. Here µ is the mean of x, σ is the standard deviation of x, and E(t) is the expected value of t | \( K = \frac{{E\left( {x - \mu } \right)^{4} }}{{\sigma^{4} }} \) |
Energy Hist | Calculates the energy of the approximate probability density of intensity occurrence in the histogram | |
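The texture descriptors above operate on a normalized gray-level co-occurrence matrix P(i, j), and the histogram descriptors on the intensity distribution. The following sketch, written for this summary in plain NumPy (the tiny test images are placeholders, not the authors' data), computes contrast, energy, homogeneity, skewness and kurtosis exactly as defined in the table.

```python
import numpy as np

def glcm(img, levels, dx=1, dy=0):
    """Normalized, symmetric gray-level co-occurrence matrix for one pixel offset."""
    P = np.zeros((levels, levels))
    h, w = img.shape
    for i in range(h - dy):
        for j in range(w - dx):
            a, b = img[i, j], img[i + dy, j + dx]
            P[a, b] += 1
            P[b, a] += 1          # count the pair in both directions (symmetric)
    return P / P.sum()

def glcm_features(P):
    """Contrast, energy and homogeneity as defined in the table above."""
    i, j = np.indices(P.shape)
    contrast = np.sum(np.abs(i - j) ** 2 * P)
    energy = np.sum(P ** 2)
    homogeneity = np.sum(P / (1 + np.abs(i - j)))
    return contrast, energy, homogeneity

def hist_moments(pixels):
    """Skewness E(x-mu)^3/sigma^3 and kurtosis E(x-mu)^4/sigma^4 of the intensities."""
    x = np.asarray(pixels, float).ravel()
    mu, sigma = x.mean(), x.std()
    skew = np.mean((x - mu) ** 3) / sigma ** 3
    kurt = np.mean((x - mu) ** 4) / sigma ** 4
    return skew, kurt

flat = np.zeros((4, 4), dtype=int)       # constant image
stripes = np.tile([0, 1], (4, 2))        # alternating 0/1 columns
```

A constant image gives contrast 0, energy 1 and homogeneity 1, matching the ranges stated in the table, while the striped image maximizes contrast for two gray levels; this is why these features discriminate between smooth and strongly textured leaf surfaces.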
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Marques, P. et al. (2019). Grapevine Varieties Classification Using Machine Learning. In: Moura Oliveira, P., Novais, P., Reis, L. (eds.) Progress in Artificial Intelligence. EPIA 2019. Lecture Notes in Computer Science, vol. 11804. Springer, Cham. https://doi.org/10.1007/978-3-030-30241-2_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30240-5
Online ISBN: 978-3-030-30241-2