Grapevine Varieties Classification Using Machine Learning

  • Conference paper
  • Progress in Artificial Intelligence (EPIA 2019)
  • Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11804)

Abstract

Viticulture has a major impact on the European economy, and over the years intensive grapevine production has led to the proliferation of many varieties. Traditionally, these varieties are catalogued manually in the field, a costly and slow process that is, in many cases, very challenging even for an experienced ampelographer. This article presents a cost-effective, automatic method for classifying grapevine varieties based on the analysis of leaf images taken with an RGB sensor. The proposed method is divided into three steps: (1) extraction of color and shape features; (2) training; and (3) classification using Linear Discriminant Analysis. The approach was applied to 240 leaf images of three grapevine varieties acquired in the Douro Valley region of Portugal, and it correctly classified 87% of the grapevine leaves. The method showed very promising classification capabilities considering the challenges posed by the leaves, which had many shape irregularities and, in many cases, high color similarity across varieties. The results obtained, compared with the manual procedure, suggest that the method can be used as an effective alternative for grapevine classification based on leaf features. Since the proposed method requires only a simple, low-cost setup, it can easily be integrated into a portable system with real-time processing to assist field technicians or other staff without special skills, or used offline for batch classification.

References

  1. Vivier, M.A., Pretorius, I.S.: Genetically tailored grapevines for the wine industry. Trends Biotechnol. 20, 472–478 (2002)

  2. This, P., Lacombe, T., Thomas, M.: Historical origins and genetic diversity of wine grapes. Trends Genet. 22, 511–519 (2006)

  3. Thomas, M.R., Cain, P., Scott, N.S.: DNA typing of grapevines: a universal methodology and database for describing cultivars and evaluating genetic relatedness. Plant Mol. Biol. 25, 939–949 (1994)

  4. Diago, M.P., Fernandes, A.M., Millan, B., Tardaguila, J., Melo-Pinto, P.: Identification of grapevine varieties using leaf spectroscopy and partial least squares. Comput. Electron. Agric. 99, 7–13 (2013)

  5. Fuentes, S., Hernández-Montes, E., Escalona, J.M., Bota, J., Gonzalez Viejo, C., Poblete-Echeverría, C., Tongson, E., Medrano, H.: Automated grapevine cultivar classification based on machine learning using leaf morpho-colorimetry, fractal dimension and near-infrared spectroscopy parameters. Comput. Electron. Agric. 151, 311–318 (2018)

  6. Gutiérrez, S., Tardaguila, J., Fernández-Novales, J., Diago, M.P.: Support vector machine and artificial neural network models for the classification of grapevine varieties using a portable NIR spectrophotometer. PLoS ONE 10, e0143197 (2015)

  7. Gutiérrez, S., Fernández-Novales, J., Diago, M.P., Tardaguila, J.: On-The-Go hyperspectral imaging under field conditions and machine learning for the classification of grapevine varieties. Front. Plant Sci. 9, 1102 (2018)

  8. Karasik, A., Rahimi, O., David, M., Weiss, E., Drori, E.: Development of a 3D seed morphological tool for grapevine variety identification, and its comparison with SSR analysis. Sci. Rep. 8, 6545 (2018)

  9. Otsu, N.: A threshold selection method from gray-level histograms. IEEE Trans. Syst. Man Cybern. 9, 62–66 (1979)

  10. Bendig, J., et al.: Combining UAV-based plant height from crop surface models, visible, and near infrared vegetation indices for biomass monitoring in barley. Int. J. Appl. Earth Obs. Geoinf. 39, 79–87 (2015)

  11. Gonzalez, R.C., Woods, R.E.: Digital Image Processing, 4th edn. Pearson (2017)

  12. Rodgers, J.L., Nicewander, W.A.: Thirteen ways to look at the correlation coefficient. Am. Stat. 42, 59–66 (1988)

  13. Guyon, I., Elisseeff, A.: An introduction to variable and feature selection. J. Mach. Learn. Res. 3, 1157–1182 (2003)

  14. Yu, H.-F., Huang, F.-L., Lin, C.-J.: Dual coordinate descent methods for logistic regression and maximum entropy models. Mach. Learn. 85, 41–75 (2011)

  15. Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer Series in Statistics, 2nd edn. Springer, New York (2009). https://doi.org/10.1007/978-0-387-84858-7

  16. Altman, N.S.: An introduction to kernel and nearest-neighbor nonparametric regression. Am. Stat. 46, 175–185 (1992)

  17. Breiman, L.: Classification and Regression Trees. Routledge, Boca Raton (2017)

  18. Zhang, H.: The Optimality of Naive Bayes. In: FLAIRS2004 Conference (2004)

  19. Guyon, I., Boser, B., Vapnik, V.: Automatic capacity tuning of very large VC-dimension classifiers. In: Proceedings Advances in Neural Information Processing Systems, vol. 5, pp. 147–155 (1992)

  20. Kohavi, R.: A study of cross-validation and bootstrap for accuracy estimation and model selection, Montreal, vol. 14, pp. 1137–1145 (1995)

  21. Du, J.-X., Wang, X.-F., Zhang, G.-J.: Leaf shape based plant species recognition. Appl. Math. Comput. 185, 883–893 (2007)

  22. Silva, P.F.B., Marçal, A.R.S., da Silva, R.M.A.: Evaluation of features for leaf discrimination. In: Kamel, M., Campilho, A. (eds.) ICIAR 2013. LNCS, vol. 7950, pp. 197–204. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-39094-4_23

  23. Pauwels, E.J., de Zeeuw, P.M., Ranguelova, E.B.: Computer-assisted tree taxonomy by automated image recognition. Eng. Appl. Artif. Intell. 22, 26–31 (2009)

  24. Yang, M., Kpalma, K., Ronsin, J.: A survey of shape feature extraction techniques. In: Yin, P.-Y. (ed.) Pattern Recognition, pp. 43–90. IN-TECH (2008)

  25. Ghozlen, N.B., Cerovic, Z.G., Germain, C., Toutain, S., Latouche, G.: Non-destructive optical monitoring of grape maturation by proximal sensing. Sensors 10, 10040–10068 (2010)

  26. Mokhtarian, F., Mackworth, A.K.: A theory of multiscale, curvature-based shape representation for planar curves. IEEE Trans. Pattern Anal. Mach. Intell. 14, 789–805 (1992)

  27. Jalba, A.C., Wilkinson, M.H.F., Roerdink, J.B.T.M.: Shape representation and recognition through morphological curvature scale spaces. IEEE Trans. Image Process. 15, 331–341 (2006)

Acknowledgements

This work is financed by the ERDF – European Regional Development Fund through the Operational Programme for Competitiveness and Internationalisation - COMPETE 2020 Programme within project «POCI-01-0145-FEDER-006961», and by National Funds through the FCT – Fundação para a Ciência e a Tecnologia (Portuguese Foundation for Science and Technology) as part of project UID/EEA/50014/2013.

Author information

Corresponding author

Correspondence to António Sousa.

Appendices

Appendix A

Each entry lists the feature, its description and, where applicable, its equation.

Eccentricity [21,22,23,24]

Ratio of the length of the main inertia axis of the ROI, \( E_{A} \), to the length of the minor inertia axis of the ROI, \( E_{B} \)

\( E = \frac{E_{A}}{E_{B}} \)

Aspect Ratio [21,22,23,24]

Ratio between the maximum length \( D_{max} \) and the minimum length \( D_{min} \) of the minimum bounding rectangle (MBR) of the leaf with holes

\( AR = \frac{D_{max}}{D_{min}} \)

Aspect Ratio 2

Same as Aspect Ratio but without considering the holes of the leaf

 

Elongation [22,23,24]

For each point inside the ROI, the minimal distance to the boundary contour \( \partial ROI \) is computed; \( D_{me} \) denotes the maximum of these distances over the region

\( D_{me} = \max_{x \in ROI} d\left( {x,\partial ROI} \right) \)

Then elongation is defined as:

\( E = 1 - \frac{{2D_{me} }}{{D_{ROI} }} \)

Solidity [21,22,23,24]

Describes the extent to which the shape is convex or concave, as the ratio of the ROI area \( A_{ROI} \) to the area of its convex hull \( A_{CH} \)

\( S = \frac{{A_{ROI} }}{{A_{CH} }} \)

Isoperimetric Factor [21,22,23,24]

If the closed contour \( \partial ROI \) of length \( L\left( {\partial ROI} \right) \) encloses a region ROI of area \( A\left( {ROI} \right) \), the isoperimetric factor is defined by

\( IF = \frac{4\pi A\left( {ROI} \right)}{{L\left( {\partial ROI} \right)^{2} }} \)

Maximal Indentation Depth [22, 23]

For each point on the contour of the ROI, the distance to the convex hull is determined, which gives a distance function along the contour. The Maximal Indentation Depth is the maximum of this function

 

Rectangularity [21, 24]

Represents the ratio of \( A_{ROI} \) to the area of the minimum bounding rectangle (MBR)

\( Rect = \frac{{A_{ROI} }}{{D_{max} \times D_{min} }} \)

Convex Perimeter Ratio [25]

Represents the ratio of the ROI perimeter \( P_{ROI} \) to the convex hull perimeter \( P_{CH} \)

\( CPR = \frac{{P_{ROI} }}{{P_{CH} }} \)
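The area and perimeter ratios above (Solidity, Isoperimetric Factor, Convex Perimeter Ratio) can be illustrated with a minimal sketch. It assumes the leaf contour is available as an ordered list of polygon vertices, which is an assumption of this example and not necessarily how the ROI is represented in the paper; the helper names `polygon_area`, `polygon_perimeter`, `convex_hull` and `shape_ratios` are hypothetical.

```python
import math

def polygon_area(pts):
    """Area of a simple polygon via the shoelace formula (vertices in order)."""
    n = len(pts)
    s = 0.0
    for i in range(n):
        x0, y0 = pts[i]
        x1, y1 = pts[(i + 1) % n]
        s += x0 * y1 - x1 * y0
    return abs(s) / 2.0

def polygon_perimeter(pts):
    """Length of the closed contour through the vertices."""
    n = len(pts)
    return sum(math.dist(pts[i], pts[(i + 1) % n]) for i in range(n))

def convex_hull(pts):
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(pts))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def shape_ratios(contour):
    """Solidity, isoperimetric factor and convex perimeter ratio of a contour."""
    hull = convex_hull(contour)
    a_roi, a_ch = polygon_area(contour), polygon_area(hull)
    p_roi, p_ch = polygon_perimeter(contour), polygon_perimeter(hull)
    return {
        "solidity": a_roi / a_ch,                            # S = A_ROI / A_CH
        "isoperimetric_factor": 4 * math.pi * a_roi / p_roi ** 2,
        "convex_perimeter_ratio": p_roi / p_ch,              # CPR = P_ROI / P_CH
    }

# Hypothetical L-shaped contour: one indentation, so solidity < 1 and CPR > 1.
l_shape = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
ratios = shape_ratios(l_shape)
```

For a convex contour such as a square, the sketch gives a solidity and convex perimeter ratio of 1, which matches the intent of both features as measures of departure from convexity.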

Circularity [21, 24]

Defined from all of the boundary points of the ROI

\( C = \frac{\mu_{R}}{\sigma_{R}} \)

where \( \mu_{R} \) is the mean distance between the centre of the ROI and all of the boundary points, and \( \sigma_{R} \) is the quadratic mean deviation from that mean distance:

\( \mu_{R} = \frac{1}{N}\sum\limits_{i = 0}^{N - 1} {\left\| {\left( {x_{i} ,y_{i} } \right) - \left( {\bar{x}, \bar{y}} \right)} \right\|} \)

\( \sigma_{R} = \frac{1}{N}\sum\limits_{i = 0}^{N - 1} {\left( {\left\| {\left( {x_{i} ,y_{i} } \right) - \left( {\bar{x}, \bar{y}} \right)} \right\| - \mu_{R} } \right)^{2} } \)
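As a minimal sketch of the Circularity feature, the snippet below computes the mean radial distance and its quadratic mean deviation from a set of boundary points. Two assumptions are made that the text does not confirm: the centre \( \left( {\bar{x}, \bar{y}} \right) \) is taken as the mean of the boundary points, and the deviation is read as the mean squared difference between each radial distance and the mean distance. The function name `circularity` is hypothetical.

```python
import numpy as np

def circularity(boundary):
    """Return (C, mu_R, sigma_R) for an (N, 2) array of boundary points."""
    pts = np.asarray(boundary, dtype=float)
    centre = pts.mean(axis=0)                      # assumption: centroid of the boundary points
    radial = np.linalg.norm(pts - centre, axis=1)  # distance of each point to the centre
    mu_r = radial.mean()
    sigma_r = np.mean((radial - mu_r) ** 2)        # quadratic mean deviation
    c = mu_r / sigma_r if sigma_r > 0 else float("inf")
    return c, mu_r, sigma_r

# Hypothetical four-point boundary with radial distances 1, 1, 2, 2.
c, mu_r, sigma_r = circularity([(1, 0), (-1, 0), (0, 2), (0, -2)])
```

Under this reading, the closer the contour is to a circle, the smaller \( \sigma_{R} \) becomes and the larger the feature value, so a perfect circle is reported as infinity in the sketch.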

 

Sphericity [25]

Represents the ratio between the radius of the incircle of the ROI, \( R_{in} \), and the radius of the excircle of the ROI, \( R_{ex} \)

\( S = \frac{{R_{in} }}{{R_{ex} }} \)

Entirety

Ratio between the difference of \( A_{CH} \) and \( A_{ROI} \), and \( A_{ROI} \)

\( Ent = \frac{{A_{CH} - A_{ROI} }}{{A_{ROI} }} \)

Extent

Ratio between \( A_{ROI} \) and the product of the bounding box (BB) width and height

\( Ex = \frac{{A_{ROI} }}{{BB_{width} \times BB_{height} }} \)

Equiv Diameter

Calculates the diameter of a circle with the same area as the ROI

\( ED = \sqrt {\frac{{4 \times A_{ROI} }}{\pi }} \)
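The Entirety, Extent and Equiv Diameter formulas can be sketched on a binary mask as follows. The example assumes the ROI is a boolean NumPy array, that areas are pixel counts, and that the bounding box is axis-aligned; none of these is stated in the text. The convex hull area is passed in as a plain number, and the names `mask_features` and `entirety` are hypothetical.

```python
import math
import numpy as np

def mask_features(mask):
    """Return (extent, equivalent diameter) of a binary ROI mask."""
    mask = np.asarray(mask, dtype=bool)
    rows, cols = np.nonzero(mask)
    a_roi = int(mask.sum())                          # area as a pixel count
    bb_height = int(rows.max() - rows.min() + 1)     # axis-aligned bounding box
    bb_width = int(cols.max() - cols.min() + 1)
    extent = a_roi / (bb_width * bb_height)          # Ex = A_ROI / (BB_width * BB_height)
    equiv_diameter = math.sqrt(4.0 * a_roi / math.pi)
    return extent, equiv_diameter

def entirety(a_ch, a_roi):
    """Ent = (A_CH - A_ROI) / A_ROI, given both areas."""
    return (a_ch - a_roi) / a_roi

# Hypothetical 4 x 6 pixel rectangle inside a 10 x 10 image.
mask = np.zeros((10, 10), dtype=bool)
mask[2:6, 1:7] = True
extent, equiv_diameter = mask_features(mask)
```

A filled rectangle fills its bounding box completely, so its extent is 1; removing pixels from the ROI lowers the value.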

Number Curvatures [24, 26, 27]

Number of corners detected using a 5 × 5 pixel neighbourhood. To use curvature for shape representation, the curvature function K(n) is taken as:

\( K\left( n \right) = \frac{{\dot{x}\left( n \right)\ddot{y}\left( n \right) - \dot{y}\left( n \right)\ddot{x}\left( n \right) }}{{\left( {\dot{x}\left( n \right)^{2} + \dot{y}\left( n \right)^{2} } \right)^{3/2} }} \)

That is, the curvature of a planar curve is computed from its parametric representation. If n is the normalized arc-length parameter s, then:

\( K\left( s \right) = \dot{x}\left( s \right)\ddot{y}\left( s \right) - \dot{y}\left( s \right)\ddot{x}\left( s \right) \)

Because the curvature function is computed only from parametric derivatives, it is invariant under rotations and translations. The curvature measure is, however, scale dependent, i.e., inversely proportional to the scale. A possible way to achieve scale independence is to normalize this measure by the mean absolute curvature, i.e.,

\( K^{\prime}\left( s \right) = \frac{K\left( s \right)}{{\frac{1}{N}\mathop \sum \nolimits_{s = 1}^{N} \left| {K\left( s \right)} \right|}} \)

where N is the number of points on the normalized contour
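A minimal sketch of the curvature function and its scale normalization is given below. It assumes a closed contour sampled as two coordinate arrays and approximates the derivatives with periodic central differences; the discretisation and the function names `contour_curvature` and `normalised_curvature` are assumptions of this example. Counting corners from the curvature is not covered.

```python
import numpy as np

def contour_curvature(x, y):
    """Discrete K(n) for a closed contour given as coordinate arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Periodic central differences for first and second derivatives w.r.t. n.
    dx = (np.roll(x, -1) - np.roll(x, 1)) / 2.0
    dy = (np.roll(y, -1) - np.roll(y, 1)) / 2.0
    ddx = np.roll(x, -1) - 2.0 * x + np.roll(x, 1)
    ddy = np.roll(y, -1) - 2.0 * y + np.roll(y, 1)
    return (dx * ddy - dy * ddx) / (dx ** 2 + dy ** 2) ** 1.5

def normalised_curvature(k):
    """K'(s): curvature divided by the mean absolute curvature."""
    k = np.asarray(k, dtype=float)
    return k / np.mean(np.abs(k))

# Hypothetical test contour: a circle of radius 2 traversed counter-clockwise.
t = np.linspace(0.0, 2.0 * np.pi, 400, endpoint=False)
k = contour_curvature(2.0 * np.cos(t), 2.0 * np.sin(t))
k_norm = normalised_curvature(k)
```

On the circle the raw curvature is close to 1/2 (the reciprocal of the radius), while the normalized curvature is close to 1 whatever the radius, which illustrates the scale independence described above.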

 

Appendix B

Each entry lists the feature, its description and, where applicable, its equation.

Mean

Average of leaf pixel values

 

Deviation

Standard deviation of leaf pixel values

 

Softness

Calculate the smoothness of the image

\( Sft = 1 - \frac{1}{{1 + Deviation^{2} }} \)
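The Mean, Deviation and Softness features can be sketched as follows. The example assumes a single-channel image with a boolean leaf mask and uses the population standard deviation; both choices, and the name `first_order_stats`, are assumptions and not details taken from the paper.

```python
import numpy as np

def first_order_stats(image, mask):
    """Return (mean, deviation, softness) of the pixels selected by mask."""
    pixels = np.asarray(image, dtype=float)[np.asarray(mask, dtype=bool)]
    mean = pixels.mean()
    deviation = pixels.std()                          # assumption: population standard deviation
    softness = 1.0 - 1.0 / (1.0 + deviation ** 2)     # Sft = 1 - 1 / (1 + Deviation^2)
    return mean, deviation, softness

# Hypothetical 2 x 2 image; the bottom-right pixel is background.
image = [[1, 3], [5, 100]]
mask = [[True, True], [True, False]]
mean, deviation, softness = first_order_stats(image, mask)
```

Softness is 0 for a region of constant intensity and approaches 1 as the deviation grows.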

Contrast

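Returns the average of the measure of the intensity contrast between a pixel and its neighbour over the whole image, computed from the entries \( p\left( {i,j} \right) \) of the gray-level co-occurrence matrix (GLCM); also known as Variance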

\( C = \sum\limits_{i,j} {\left| {i - j} \right|^{2} p\left( {i,j} \right)} \)

Correlation

Returns the average of the measure of how correlated a pixel is to its neighbour over the whole image, where the range is between –1 and 1. Correlation is 1 or –1 for a perfectly positively or negatively correlated image. Correlation is NaN for a constant image

\( Corr = \sum\limits_{i,j} {\frac{{\left( {i - \mu_{i} } \right)\left( {j - \mu_{j} } \right)p\left( {i,j} \right)}}{{\sigma_{i} \sigma_{j} }}} \)

Energy

Returns the average of the sum of squared elements in the GLCM, where the range is between 0 and 1. Energy is 1 for a constant image

The property Energy is also known as uniformity, uniformity of energy, and angular second moment, and it is calculated by:

\( En = \sum\limits_{i,j} {p\left( {i,j} \right)^{2} } \)

Homogeneity

Returns the average of the value that measures the closeness of the distribution of elements in the GLCM to the GLCM diagonal, where the range is between 0 and 1. Homogeneity is 1 for a diagonal GLCM

\( Homo = \sum\limits_{i,j} {\frac{{p\left( {i,j} \right)}}{{1 + \left| {i - j} \right|}}} \)
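The four GLCM properties above (Contrast, Correlation, Energy, Homogeneity) can be sketched in NumPy. The example assumes an image already quantised to integer gray levels and a single non-negative pixel offset (the horizontal neighbour by default); the descriptions mention an average, so averaging the result over several offsets would be a natural extension that is not shown here. The names `glcm` and `glcm_features` are hypothetical.

```python
import numpy as np

def glcm(image, levels, dx=1, dy=0):
    """Normalised co-occurrence matrix p(i, j) for one offset (dx, dy >= 0)."""
    img = np.asarray(image, dtype=int)
    h, w = img.shape
    ref = img[:h - dy, :w - dx]          # reference pixels
    nbr = img[dy:, dx:]                  # their neighbours at the given offset
    counts = np.zeros((levels, levels), dtype=float)
    np.add.at(counts, (ref.ravel(), nbr.ravel()), 1.0)
    return counts / counts.sum()

def glcm_features(p):
    """Contrast, correlation, energy and homogeneity of a normalised GLCM."""
    i, j = np.indices(p.shape)
    mu_i, mu_j = (i * p).sum(), (j * p).sum()
    sigma_i = np.sqrt((((i - mu_i) ** 2) * p).sum())
    sigma_j = np.sqrt((((j - mu_j) ** 2) * p).sum())
    if sigma_i > 0 and sigma_j > 0:
        correlation = (((i - mu_i) * (j - mu_j) * p).sum()) / (sigma_i * sigma_j)
    else:
        correlation = float("nan")       # constant image, as noted in the description
    return {
        "contrast": (np.abs(i - j) ** 2 * p).sum(),
        "correlation": correlation,
        "energy": (p ** 2).sum(),
        "homogeneity": (p / (1.0 + np.abs(i - j))).sum(),
    }

# Hypothetical two-level image with one vertical edge.
features = glcm_features(glcm([[0, 0, 1, 1], [0, 0, 1, 1]], levels=2))
```

For a constant image the sketch gives an energy and homogeneity of 1, a contrast of 0 and a NaN correlation, in line with the ranges stated in the descriptions.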

Mean Hist

Calculate the average of the approximate probability density of occurrence of the intensity in the histogram

 

Variance Hist

Calculate the variance of the approximate probability density of occurrence of the intensity in the histogram

 

Skewness Hist

Calculate the skewness of the approximate probability density of occurrence of the intensity in the histogram

Skewness is a measure of the asymmetry of the data around the sample mean. If skewness is negative, the data are spread out more to the left of the mean than to the right. If skewness is positive, the data are spread out more to the right. The skewness of the normal distribution (or any perfectly symmetric distribution) is zero

The skewness of a distribution is defined as:

\( Sk = \frac{{E\left( {x - \mu } \right)^{3} }}{{\sigma^{3} }} \)

where µ is the mean of x, σ is the standard deviation of x, and E(t) represents the expected value of the quantity t

 

Kurtosis Hist

Calculate the kurtosis of the approximate probability density of occurrence of the intensity in the histogram

Kurtosis is a measure of how outlier-prone a distribution is. The kurtosis of the normal distribution is 3. Distributions that are more outlier-prone than the normal distribution have kurtosis greater than 3; distributions that are less outlier-prone have kurtosis less than 3

The kurtosis of a distribution is defined as

\( K = \frac{{E\left( {x - \mu } \right)^{4} }}{{\sigma^{4} }} \)

where μ is the mean of x, σ is the standard deviation of x, and E(t) represents the expected value of the quantity t
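The skewness and kurtosis definitions can be sketched directly from their formulas. The input `x` is assumed to be a one-dimensional sample, for instance the values of a normalised intensity histogram; the text does not state exactly which values the statistics are applied to, so that choice is left to the caller. Expectations are replaced by sample means, and the function names are hypothetical.

```python
import numpy as np

def skewness(x):
    """Sk = E(x - mu)^3 / sigma^3, using sample means for the expectations."""
    x = np.asarray(x, dtype=float)
    mu, sigma = x.mean(), x.std()
    return np.mean((x - mu) ** 3) / sigma ** 3

def kurtosis(x):
    """K = E(x - mu)^4 / sigma^4 (not excess kurtosis; a normal distribution gives 3)."""
    x = np.asarray(x, dtype=float)
    mu, sigma = x.mean(), x.std()
    return np.mean((x - mu) ** 4) / sigma ** 4

# Hypothetical sample with a long right tail, so the skewness is positive.
sample = [0.0, 0.0, 0.0, 4.0]
sk, ku = skewness(sample), kurtosis(sample)
```

Both functions divide by a power of the standard deviation, so a constant input is undefined under these formulas and would need separate handling in practice.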

 

Energy Hist

Calculate the energy of the approximate probability density of occurrence of the intensity in the histogram

 

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Marques, P. et al. (2019). Grapevine Varieties Classification Using Machine Learning. In: Moura Oliveira, P., Novais, P., Reis, L. (eds) Progress in Artificial Intelligence. EPIA 2019. Lecture Notes in Computer Science, vol 11804. Springer, Cham. https://doi.org/10.1007/978-3-030-30241-2_17

  • DOI: https://doi.org/10.1007/978-3-030-30241-2_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-30240-5

  • Online ISBN: 978-3-030-30241-2

  • eBook Packages: Computer Science, Computer Science (R0)
