Abstract
Coffee rust is a major problem for coffee farmers. Several rust outbreaks have occurred in Latin American countries such as Colombia, Mexico, Peru, Ecuador and El Salvador. Because of the damage caused by coffee rust, several regression models have been proposed to estimate rust incidence from weather variables. However, these models lack real rust samples because collecting samples is costly in both money and time. To address this issue, we propose a mechanism called SYnthetic Samples GENerator (SYSGEN). It uses cubic spline interpolation to increase the number of rust incidence samples (RIS) and expert knowledge to adjust the rust progress curve in Colombian coffee crops. To demonstrate the reliability of SYSGEN, we built 132 regression models from synthetic incidence samples (dependent variable) and weather observations (independent variables), considering three Colombian coffee regions, five experiments and four regression models. In addition, we used Recursive Feature Elimination (RFE) to select the relevant weather variables. The analysis of these models and of RFE is promising, since it reveals several aspects and effects related to rust development. One of them is that the regression models frequently used temperature (maximum, minimum and average) and relative humidity variables. Experts consider these meteorological variables key drivers of the germination, penetration, colonization and sporulation phases. In terms of performance, our experiments allow us to conclude that random forest (RF) and bagging trees (BT) reached the lowest Root Mean Square Error (RMSE). Finally, different datasets produce different performance. For example, in the experiments involving flowering-period datasets, RF reached the lowest RMSE, whereas in the coffee harvest-period datasets, BT reached the lowest RMSE.
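The interpolation step named in the abstract can be illustrated with a minimal sketch. It fits a natural cubic spline through a few sparse incidence observations and samples it daily to obtain a denser series. The function names, the natural boundary condition, the clipping to the 0-100% range and the sample values are assumptions made for illustration only; the expert-knowledge adjustment of the rust progress curve that SYSGEN also applies is not covered here.

```python
from bisect import bisect_right


def natural_cubic_spline(xs, ys):
    """Return a function evaluating the natural cubic spline through (xs, ys)."""
    n = len(xs) - 1  # number of intervals between knots
    if len(ys) != len(xs) or n < 2:
        raise ValueError("need at least three (x, y) pairs of equal length")
    h = [xs[i + 1] - xs[i] for i in range(n)]
    if any(step <= 0 for step in h):
        raise ValueError("xs must be strictly increasing")

    # Tridiagonal system for the interior second derivatives m[1..n-1];
    # the natural boundary condition fixes m[0] = m[n] = 0.
    diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n)]
    rhs = [
        6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
        for i in range(1, n)
    ]

    # Thomas algorithm: forward elimination, then back substitution.
    for k in range(1, n - 1):
        w = h[k] / diag[k - 1]
        diag[k] -= w * h[k]
        rhs[k] -= w * rhs[k - 1]
    m = [0.0] * (n + 1)
    for k in range(n - 2, -1, -1):
        m[k + 1] = (rhs[k] - h[k + 1] * m[k + 2]) / diag[k]

    def evaluate(x):
        # Locate the interval [xs[i], xs[i + 1]] that contains x.
        i = min(max(bisect_right(xs, x) - 1, 0), n - 1)
        left, right = x - xs[i], xs[i + 1] - x
        return (
            (m[i] * right ** 3 + m[i + 1] * left ** 3) / (6.0 * h[i])
            + (ys[i] / h[i] - m[i] * h[i] / 6.0) * right
            + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * left
        )

    return evaluate


def densify_incidence(days, incidence, step=1):
    """Interpolate sparse incidence samples (percent) onto a regular day grid."""
    spline = natural_cubic_spline(days, incidence)
    grid = range(days[0], days[-1] + 1, step)
    # Assumption: incidence is a percentage, so clip spline overshoot to [0, 100].
    return [(day, min(100.0, max(0.0, spline(day)))) for day in grid]


# Hypothetical monthly observations (day of sampling, incidence in percent).
days = [0, 30, 60, 90, 120]
incidence = [2.0, 5.5, 14.0, 22.5, 18.0]
dense = densify_incidence(days, incidence)
```

Here `dense` holds one synthetic value per day between the first and last observation, and the original observations are reproduced at their own days. The spline is written in plain Python so that the sketch has no dependencies; `scipy.interpolate.CubicSpline` with `bc_type="natural"` would be the usual library route.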
Acknowledgements
We thank Centro Nacional de Investigaciones de Café (Cenicafé) and Dr. Álvaro Gaytan Bustamante for sharing his knowledge. We are grateful to COLCIENCIAS for the PhD scholarship granted to Dr. David Camilo Corrales. In addition, we would like to thank Universidad del Cauca and the research projects “Alternativas Innovadoras de Agricultura Inteligente para sistemas productivos agrícolas del departamento del Cauca soportado en entornos de IoT” (ID 4633) and “Internacionalización de Proyectos UEES del Grupo de Ingeniería Telemática” (ID 5271), financed by “Red de formación de talento humano para la innovación social y productiva en el departamento del Cauca InnovAcción Cauca”. This work has also been supported by the Spanish MINECO under projects TRA2016-78886-C3-1-R and RTI2018-096036-B-C22.
Cite this article
Girón, E.J., Corrales, D.C., Sesmero, M.P. et al. Synthetic samples generator (SYSGEN), an approach to increase the size of incidence samples in coffee leaf rust modelling. Evolving Systems 13, 625–636 (2022). https://doi.org/10.1007/s12530-021-09395-0
DOI: https://doi.org/10.1007/s12530-021-09395-0