Abstract
Advance prediction of crop yield is critical for ensuring food security, because region-specific social and environmental challenges often disrupt the plans of policy makers. This study presents a generic methodology to configure and fine-tune a state-of-the-art Long Short-Term Memory (LSTM) based Deep Learning (DL) model through hyperparameter optimization for predicting the yield (annual crop production) of wheat, groundnut and barley over India, based on multiple independent input variables identified using a multicollinearity test. The Monte Carlo cross-validation method is used to validate the optimized LSTM models. Results from the LSTM model tuning showed that, among the four optimizers tested, Adam performed best irrespective of the crop, and that Bi-LSTM outperformed sLSTM in terms of prediction accuracy. The percentage reduction in error with Bi-LSTM compared to sLSTM in predicting wheat and groundnut yield was 39% and 13% respectively, while for barley the error reduction was marginal (0.34%). The performance of the optimized Bi-LSTM model is compared with that of traditional machine learning (ML) models, namely support vector regression (SVR), polynomial SVR (2nd and 3rd order), Auto Regressive Integrated Moving Average (ARIMA), ARIMAX (ARIMA with exogenous variables) and Vector Auto-regression (VAR). The Bi-LSTM model is found to be superior to the ML models; the percentage reduction in mean absolute scaled error with the Bi-LSTM compared to the best-performing ML model was 94%, 72% and 71% in predicting wheat, groundnut and barley yield respectively. This study showed that, with properly chosen explanatory (independent) variables and hyperparameter optimization, a simple (single-layer) deep neural network (LSTM) outperformed traditional ML models in terms of accuracy for crop yield prediction.
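The two evaluation ideas named in the abstract, mean absolute scaled error (MASE) and Monte Carlo cross-validation, can be illustrated with a minimal sketch. It assumes the usual definition of MASE (forecast error scaled by the in-sample error of a one-step naive forecast) and plain repeated random partitioning; the function names, the split fraction, the number of repeats and all numbers below are hypothetical and are not taken from the study.

```python
import random
from statistics import mean


def mase(actual, predicted, train):
    """Mean absolute scaled error.

    The forecast MAE is divided by the in-sample MAE of a one-step naive
    forecast (each training value predicted by its predecessor).
    """
    forecast_mae = mean(abs(a - p) for a, p in zip(actual, predicted))
    naive_mae = mean(abs(train[i] - train[i - 1]) for i in range(1, len(train)))
    return forecast_mae / naive_mae


def percent_reduction(baseline_error, model_error):
    """Percentage by which model_error is lower than baseline_error."""
    return 100.0 * (baseline_error - model_error) / baseline_error


def monte_carlo_splits(n_samples, n_repeats, test_fraction, seed=0):
    """Yield (train_idx, test_idx) pairs from repeated random partitions."""
    rng = random.Random(seed)
    n_test = max(1, round(n_samples * test_fraction))
    for _ in range(n_repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        yield sorted(idx[n_test:]), sorted(idx[:n_test])


# Made-up numbers, for illustration only (not the study's data).
train = [10.0, 12.0, 11.0, 13.0, 14.0]
actual = [15.0, 16.0]
baseline_pred = [13.0, 14.0]   # hypothetical baseline forecast
candidate_pred = [14.5, 16.5]  # hypothetical improved forecast

baseline_mase = mase(actual, baseline_pred, train)
candidate_mase = mase(actual, candidate_pred, train)
reduction = percent_reduction(baseline_mase, candidate_mase)

# Three random 80/20 partitions of ten samples (assumed settings).
splits = list(monte_carlo_splits(n_samples=10, n_repeats=3, test_fraction=0.2))
```

In a validation loop of this kind, a model would be refit on each training index set and scored on the matching test set, with the scores averaged over the repeats. The percentage reductions quoted in the abstract are of the same form as `percent_reduction`, although the exact computation used by the authors is not given here.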
Data availability
Openly available.
Code availability
Codes will be provided on reasonable request.
Acknowledgements
The authors would like to thank the National Mission on Himalayan Studies for funding the projects “Integrated system dynamical model to design and testing alternative intervention strategies for effective remediation and sustainable water management for two selected river basins of Indian Himalayas” and “Enhancement of the quality of livelihood opportunities and resilience for the people in the Indian Himalayas, through design of intervention strategies aimed at maximizing resource potential and minimizing risks in urban-rural ecosystem” (NMHS-2017/MG-04/480 and NMHS-2017-18/MG-02/478).
Funding
The research was not supported by any funding other than institutional support from CSIR, India.
Author information
Contributions
Kiran Kumar contributed to analyzing the data and generating the figures. Ramesh and Rakesh contributed to the conceptualization, design, data analysis and drafting of the manuscript.
Ethics declarations
Ethics approval
This work complied with the ethical standards of the journal Applied Intelligence.
Consent to participate
All listed authors have approved the manuscript before submission.
Consent for publication
All authors agreed with the content, gave explicit consent to submit, and obtained consent from the responsible authorities at the institute/organization where the work was carried out before the work was submitted.
Conflicts of interest
None.
About this article
Cite this article
Kiran Kumar, V., Ramesh, K.V. & Rakesh, V. Optimizing LSTM and Bi-LSTM models for crop yield prediction and comparison of their performance with traditional machine learning techniques. Appl Intell 53, 28291–28309 (2023). https://doi.org/10.1007/s10489-023-05005-5
DOI: https://doi.org/10.1007/s10489-023-05005-5