Synthetic samples generator (SYSGEN), an approach to increase the size of incidence samples in coffee leaf rust modelling

  • Review
  • Published in: Evolving Systems

Abstract

Coffee leaf rust is a major problem for coffee farmers. Several rust outbreaks have occurred in Latin American countries such as Colombia, Mexico, Peru, Ecuador and El Salvador. Because of the damage caused by coffee rust, several regression models have been proposed to estimate rust incidence from weather variables. However, these models lack real rust samples because collecting samples is costly in both money and time. To address this issue, we propose a mechanism called SYnthetic Samples GENerator (SYSGEN). It uses cubic spline interpolation to increase the size of rust incidence samples (RIS) and expert knowledge to adjust the rust progress curve in Colombian coffee crops. To demonstrate the reliability of SYSGEN, we built 132 regression models from synthetic incidence samples (dependent variable) and weather observations (independent variables), considering three Colombian coffee regions, five experiments and four regression models. In addition, we used Recursive Feature Elimination (RFE) to select the relevant weather variables. The analysis of these models and of RFE is promising, since it reveals several aspects and effects related to rust development. One of these is that the regression models frequently used temperature (maximum, minimum and average) and relative humidity variables. These meteorological variables are considered by experts to be key drivers of the germination, penetration, colonization and sporulation phases. In terms of performance, our experiments allow us to conclude that random forest (RF) and bagging trees (BT) reached the lowest Root Mean Square Error (RMSE). Finally, different datasets produce different performance: in the experiments that involve flowering-period datasets, the lowest RMSE was reached by RF, whereas in datasets of coffee harvest periods, BT reached the lowest RMSE.
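
The interpolation idea described in the abstract can be illustrated with a minimal sketch. It is not the SYSGEN implementation: the function names (`natural_cubic_spline`, `densify`, `rmse`), the natural boundary conditions, the clipping of incidence to the 0–100% range and the sample values are all assumptions made for illustration, and the expert-knowledge adjustment of the rust progress curve is not reproduced. The sketch fits a cubic spline through a few sparse incidence measurements, evaluates it on a finer time grid to obtain synthetic samples, and defines the RMSE used to compare regression models.

```python
from bisect import bisect_right
from math import sqrt


def natural_cubic_spline(xs, ys):
    """Return S(x), a natural cubic spline through the points (xs, ys)."""
    n = len(xs) - 1
    if n < 1 or len(ys) != len(xs):
        raise ValueError("need at least two points and matching lengths")
    if any(xs[i + 1] <= xs[i] for i in range(n)):
        raise ValueError("xs must be strictly increasing")
    h = [xs[i + 1] - xs[i] for i in range(n)]
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = 3 * (ys[i + 1] - ys[i]) / h[i] - 3 * (ys[i] - ys[i - 1]) / h[i - 1]
    # Forward sweep of the tridiagonal system (second derivative is 0 at both ends).
    diag = [1.0] * (n + 1)
    mu = [0.0] * (n + 1)
    z = [0.0] * (n + 1)
    for i in range(1, n):
        diag[i] = 2 * (xs[i + 1] - xs[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / diag[i]
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / diag[i]
    # Back substitution for the polynomial coefficients of each interval.
    b = [0.0] * n
    c = [0.0] * (n + 1)
    d = [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (ys[j + 1] - ys[j]) / h[j] - h[j] * (c[j + 1] + 2 * c[j]) / 3
        d[j] = (c[j + 1] - c[j]) / (3 * h[j])

    def spline(x):
        # Locate the interval containing x; clamp to the first or last interval.
        j = min(max(bisect_right(xs, x) - 1, 0), n - 1)
        t = x - xs[j]
        return ys[j] + b[j] * t + c[j] * t ** 2 + d[j] * t ** 3

    return spline


def densify(days, incidence, step):
    """Generate synthetic (day, incidence) samples every `step` days."""
    spline = natural_cubic_spline(days, incidence)
    count = int((days[-1] - days[0]) // step)
    samples = []
    for k in range(count + 1):
        day = days[0] + k * step
        # Assumption: incidence is a percentage, so keep it within [0, 100].
        samples.append((day, min(max(spline(day), 0.0), 100.0)))
    return samples


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical monthly incidence measurements (% of infected leaves).
days = [0, 30, 60, 90, 120]
incidence = [2.0, 5.5, 12.0, 20.5, 18.0]
synthetic = densify(days, incidence, step=10)
```

Here five measurements become thirteen samples, one every ten days, and the original measurements are preserved at their own dates. A spline can overshoot between measurements, which is one reason an adjustment step such as the expert-knowledge correction mentioned in the abstract may be needed.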

Fig. 1
Fig. 2
Fig. 3 (source: Buitrón et al. (2019))
Fig. 4

References

  • Akhtar U, Hassan M (2015) Big data mining based on computational intelligence and fuzzy clustering. In: Handbook of research on trends and future directions in big data and web intelligence, pp 130–148

  • Avelino J (2008) The coffee rust crises in Colombia and Central America. Food Security 7(2):303–321

  • Bartels RH, Beatty JC, Barsky BA (1987)

  • Breiman L (1996) Bagging predictors. Mach Learn 24(2):123–140. https://doi.org/10.1007/bf00058655

  • Breiman L (2001) Random forests. Mach Learn 45(1):5–32

  • Buitrón EJG, Corrales DC, Avelino J, Iglesias JA, Corrales JC (2019) Rule-based expert system for detection of coffee rust warnings in Colombian crops. J Intell Fuzzy Syst 36(5):4765–4775. https://doi.org/10.3233/jifs-179025

  • Chai T, Draxler RR (2014) Root mean square error (RMSE) or mean absolute error (MAE)? Geosci Model Dev Discuss 7(1):1525–1534. https://doi.org/10.5194/gmdd-7-1525-2014

  • Cintra ME, Meira CA, Monard MC, Camargo HA, Rodrigues LH (2011) The use of fuzzy decision trees for coffee rust warning in Brazilian crops. In: 2011 11th International conference on intelligent systems design and applications, pp 1347–1352

  • Corrales DC, Ledezma A, Hoyos J, Figueroa A, Corrales JC (2014a) A new dataset for coffee rust detection in Colombian crops base on classifiers. Sistemas Telemát 12(29):9–9. https://doi.org/10.18046/syt.v12i29.1802

  • Corrales DC, Peña Q, Andrés J, León C, Figueroa A, Corrales JC (2014b) Early warning system for coffee rust disease based on error correcting output codes: a proposal. Rev Ing Univ Medellín 13(25):57–64

  • Corrales DC, Figueroa A, Ledezma A, Corrales JC (2015) An empirical multi-classifier for coffee rust detection in Colombian crops. In: International conference on computational science and its applications, pp 60–74

  • Corrales DC, Casas AF, Ledezma A, Corrales JC (2016) Two-level classifier ensembles for coffee rust estimation in Colombian crops. Int J Agric Environ Inf Syst 7(3):41–59. https://doi.org/10.4018/ijaeis.2016070103

  • Corrales DC, German G, Rodriguez JP, Agapito L, Corrales JC (2017) Lack of data: is it enough estimating the coffee rust with meteorological time series. Comput Sci Appl ICCSA 10405:3–16

  • Corrales DC, Lasso E, Casas AF, Ledezma A, Corrales JC (2018a) Estimation of coffee rust infection and growth through two-level classifier ensembles based on expert knowledge. Int J Bus Intell Data Min 13(4):369–369. https://doi.org/10.1504/ijbidm.2018.094984

  • Corrales DC, Lasso E, Ledezma A, Corrales JC (2018b) Feature selection for classification tasks: expert knowledge or traditional methods? J Intell Fuzzy Syst 34(5):2825–2835. https://doi.org/10.3233/jifs-169470

  • Cressie N (1990) The origins of kriging. Math Geol 22(3):239–252. https://doi.org/10.1007/bf00889887

  • Deguine JP, Gloanec C, Laurent P, Ratnadass A, Aubertot JN (2017) Agroecological crop protection

  • Eshmawi A, Nair S (2014) Semi-synthetic data for enhanced SMS spam detection: using synthetic minority oversampling technique SMOTE. In: Proceedings of the 6th international conference on management of emergent digital ecosystems, pp 206–212

  • Granitto PM, Furlanello C, Biasioli F, Gasperi F (2006) Recursive feature elimination with random forest for PTR-MS analysis of agroindustrial products. Chemom Intell Lab Syst 83(2):83–90. https://doi.org/10.1016/j.chemolab.2006.01.007

  • Griffiths E (1972) ‘Negative’ effects of fungicides in coffee. Trop Sci 14(1):79–89

  • Kuhn M (2008) Building predictive models in R using the caret package. J Stat Softw 28(5):1–26

  • Kushalappa AC, Eskes AB (1989) Advances in coffee rust research. Annu Rev Phytopathol 27(1):503–531. https://doi.org/10.1146/annurev.py.27.090189.002443

  • Lasso E, Thamada TT, Meira CAA, Corrales JC (2015) Graph patterns as representation of rules extracted from decision trees for coffee rust detection. In: Research conference on metadata and semantics research, pp 405–414

  • Lasso E, Valencia O, Corrales DC, López ID, Figueroa A, Corrales JC (2017) A cloud-based platform for decision making support in Colombian agriculture: a study case in coffee rust. In: International conference of ICT for adapting agriculture to climate change, pp 182–196

  • Luaces O, Rodrigues LHA, Meira CAA, Quevedo JR, Bahamonde A (2010) Viability of an alarm predictor for coffee rust disease using interval regression. In: International conference on industrial, engineering and other applications of applied intelligent systems, pp 337–346

  • McCook S (2006) Global rust belt: Hemileia vastatrix and the ecological integration of world coffee production since 1850. J Glob Hist 1(2):177–195. https://doi.org/10.1017/s174002280600012x

  • Neto CD, Rodrigues LHA, Meira CAA (2014) Modelos de predição da ferrugem do cafeeiro (Hemileia vastatrix Berkeley & Broome) por técnicas de mineração de dados. Coffee Sci 9(3):408–418

  • Nutman FJ, Roberts FM, Clarke RT (1963) Studies on the biology of Hemileia vastatrix Berk. & Br. Trans Brit Mycol Soc 46(1):27–44. https://doi.org/10.1016/s0007-1536(63)80005-4

  • Orzco-Miranda E (2015)

  • Perez-Ariza CB, Nicholson AE, Flores MJ (2012) Prediction of coffee rust disease using Bayesian networks. In: Proceedings of the sixth European workshop on probabilistic graphical models, pp 259–266

  • Rivillas CA, Serna CA, Cristancho MA, Gaitan AL (2011) La roya del cafeto en Colombia: Impacto manejo y costos del control. Boletín Técnico 36

  • Rodríguez JP, Corrales DC, Corrales JC (2018) A process for increasing the samples of coffee rust through machine learning methods. Int J Agric Env Inf Syst 9(2):32–52. https://doi.org/10.4018/ijaeis.2018040103

  • Sierra S, Osorio O, Gomez G, Leguizamón C (1993) Recomendaciones para el control químico de la roya del cafeto para 1993 (zonas con cosecha principal en el primer semestre del año). Cenicafé

  • Talhinhas P, Batista D, Diniz I, Vieira A, Silva DN, Loureiro A, Tavares S, Pereira AP, Azinheira HG, Guerra-Guimarães L, Várzea V, do Céu Silva M (2017) The coffee leaf rust pathogen Hemileia vastatrix: one and a half centuries around the tropics. Mol Plant Pathol 18(8):1039–1051. https://doi.org/10.1111/mpp.12512

  • Waller JM, Bigger M, Hillocks RJ (2007)

  • Yee TW, Wild CJ (1996) Vector generalized additive models. J Roy Stat Soc Ser B (Methodol) 58(3):481–493. https://doi.org/10.1111/j.2517-6161.1996.tb02095.x

  • Zambolim L (2015)

Acknowledgements

We thank Centro Nacional de Investigaciones de Café (Cenicafé) and Álvaro Gaytan Bustamante, PhD, for sharing his knowledge. We are grateful to COLCIENCIAS for the PhD scholarship granted to David Camilo Corrales, PhD. In addition, we would like to thank Universidad del Cauca and the research projects “Alternativas Innovadoras de Agricultura Inteligente para sistemas productivos agrícolas del departamento del Cauca soportado en entornos de IoT-ID 4633” and “Internacionalización de Proyectos UEES del Grupo de Ingeniería Telemática-ID 5271”, financed by “Red de formación de talento humano para la innovación social y productiva en el departamento del Cauca InnovAcción Cauca”. This work has also been supported by the Spanish MINECO under projects TRA2016-78886-C3-1-R and RTI2018-096036-B-C22.

Author information

Corresponding author

Correspondence to Jose Antonio Iglesias.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Girón, E.J., Corrales, D.C., Sesmero, M.P. et al. Synthetic samples generator (SYSGEN), an approach to increase the size of incidence samples in coffee leaf rust modelling. Evolving Systems 13, 625–636 (2022). https://doi.org/10.1007/s12530-021-09395-0

  • DOI: https://doi.org/10.1007/s12530-021-09395-0
