Skip to main content
Log in

Optimizing LSTM and Bi-LSTM models for crop yield prediction and comparison of their performance with traditional machine learning techniques

  • Published:
Applied Intelligence Aims and scope Submit manuscript

Abstract

Advance prediction of crop yield is very critical in the context of ensuring food security as the region specific challenges in social and environmental conditions often infringe plan of policy makers. This study presents a generic methodology to configure and fine tune the state-of-the-art Long Short-Term Memory (LSTM) based Deep Learning (DL) model through hyperparameter optimization for prediction of yield (annual crop production) in Wheat, Groundnut and Barely over India based on multiple independent input variables identified using multicollinearity test. The Monte Carlo cross-validation method is used to validate the optimized LSTM models. Results from the LSTM model tuning showed that among the 4 optimizers tested, Adam was found to perform better irrespective of the crop and Bi-LSTM outperformed sLSTM in terms of prediction accuracy. The percentage reduction in error with Bi-LSTM compared to sLSTM in predicting wheat and groundnut crop yield was 39% and 13% respectively while in case of barley crop, error reduction was marginal (0.34%). The performance of optimized Bi-LSTM model is compared with the performance of traditional machine learning (ML) models such as support vector regression (SVR) and SVR polynomial {2nd and 3rd order}, Auto Regressive Integrated Moving Average (ARIMA) and ARIMAX (ARIMA with exogenous variables) and Vector Auto-regression (VAR). The Bi-LSTM model is found to be superior to ML models; the percentage reduction in mean absolute scaled error with the Bi-LSTM compared to the best performing ML model was 94%, 72%, and 71% in predicting wheat, groundnut and barley yield respectively. This study showed that by choosing proper explanatory (independent) variable and hyperparameter optimization, a simple (single layer) structure of deep neural network (LSTM) outperformed traditional ML models in terms of accuracy for crop yield prediction application.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Algorithm 1:
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8

Similar content being viewed by others

Data availability

Openly available.

Code availability

Codes will be provided on reasonable request.

References

  1. Box GE, Jenkins GM, Reinsel GC, Ljung GM (2015) Time series analysis: forecasting and control. John Wiley & Sons, USA

  2. Armstrong JS (ed) (2001) Principles of forecasting: a handbook for researchers and practitioners. Kluwer Academic, Boston, MA, p 30

    Google Scholar 

  3. Hajirahimi Z, Khashei M (2019) Hybrid structures in time series modeling and forecasting: a review. Eng Appl Artif Intell 86:83–106

    MATH  Google Scholar 

  4. Wessel M, Quist-Wessel PF (2015) Auto-regressive integrated moving average (ARIMA) modeling of cocoa production in Nigeria: 1900–2025. J Crop Improv 33(4):445–455. https://doi.org/10.1080/15427528.2019.1610534

    Article  Google Scholar 

  5. Wen Q, Wang Y, Zhang H, Li Z (2019) Application of ARIMA and SVM mixed model in agricultural management under the background of intellectual agriculture. Clust Comput 22(6):14349–14358

    Google Scholar 

  6. Verma U (2022) ARIMA and ARIMAX models for sugarcane yield forecasting in northern agro-climatic zone of Haryana. J Agrometeorol 24(2):200–202. https://doi.org/10.54386/jam.v24i2.1086

    Article  Google Scholar 

  7. Mgaya JF (2019) Application of ARIMA models in forecasting livestock products consumption in Tanzania. Cogent Food Agric 5(1):1607430. https://doi.org/10.1080/23311932.2019.1607430

    Article  Google Scholar 

  8. Sapankevych NI, Sankar R (2009) Time series prediction using support vector machines: a survey. IEEE Comput Intell Mag 4(2):24–38

    Google Scholar 

  9. Kok ZH, Shariff ARM, Alfatni MSM, Khairunniza-Bejo S (2021) Support vector machine in precision agriculture: a review. Comput Electron Agric 191:106546

    Google Scholar 

  10. Raj EE, Ramesh KV, Rajkumar R (2019) Modelling the impact of agrometeorological variables on regional tea yield variability in South Indian tea-growing regions: 1981–2015. Cogent Food Agric 5(1):1581457. https://doi.org/10.1080/23311932.2019.1581457

    Article  Google Scholar 

  11. Umoh U, Asuquo D, Eyoh I, Abayomi A, Nyoho E, Vincent H (2022) A fuzzy-based support vector regression framework for crop yield prediction. In Soft Computing: Theories and Applications: Proceedings of SoCTA 2020, Volume 1 (pp. 173–185). Springer Singapore

  12. Parmezan ARS, Souza VM, Batista GE (2019) Evaluation of statistical and machine learning models for time series prediction: identifying the state-of-the-art and the best conditions for the use of each model. Inf Sci 484:302–337

    Google Scholar 

  13. Makridakis S, Spiliotis E, Assimakopoulos V (2018) Statistical and machine learning forecasting methods: concerns and ways forward. PLoS ONE 13(3):e0194889

    Google Scholar 

  14. Schmidt J, Marques MRG, Botti S et al (2019) Recent advances and applications of machine learning in solid-state materials science. NPJ Comput Mater 5:83. https://doi.org/10.1038/s41524-019-0221-0

    Article  Google Scholar 

  15. Rajula HSR, Verlato G, Manchia M, Antonucci N, Fanos V (2020) Comparison of conventional statistical methods with machine learning in medicine: diagnosis, drug development, and treatment. Medicina 56(9):455

    Google Scholar 

  16. Benos L, Tagarakis AC, Dolias G, Berruto R, Kateris D, Bochtis D (2021) Machine learning in agriculture: a comprehensive updated review. Sensors 21(11):3758

    Google Scholar 

  17. Thai TH, Omari RA, Barkusky D, Bellingrath-Kimura SD (2020) Statistical analysis versus the M5P machine learning algorithm to analyze the yield of winter wheat in a long-term fertilizer experiment. Agronomy 10(11):1779

    Google Scholar 

  18. Meshram V, Patil K, Meshram V, Hanchate D, Ramkteke SD (2021) Machine learning in agriculture domain: a state-of-art survey. Artif Intell Life Sci 1:100010

    Google Scholar 

  19. Abiodun OI, Jantan A, Omolara AE, Dada KV, Mohamed NA, Arshad H (2018) State-of-the-art in artificial neural network applications: a survey. Heliyon 4(11):e00938

    Google Scholar 

  20. Khan T, Jiangtao Q, Muhammad AAQ, Muhammad SI, Rashid M, Waqar H (2020) Agricultural fruit prediction using deep neural networks. Procedia Computer Science 174:72–78

    Google Scholar 

  21. Van Klompenburg T, Kassahun A, Catal C (2020) Crop yield prediction using machine learning: a systematic literature review. Comput Electron Agric 177:105709

    Google Scholar 

  22. Akbar A, Kuanar A, Patnaik J, Mishra A, Nayak S (2018) Application of artificial neural network modeling for optimization and prediction of essential oil yield in turmeric (Curcuma longa L.). Comput Electron Agric 148:160–178

    Google Scholar 

  23. Srivastava AK, Safaei N, Khaki S et al (2022) Winter wheat yield prediction using convolutional neural networks from environmental and phenological data. Sci Rep 12:3215. https://doi.org/10.1038/s41598-022-06249-w

    Article  Google Scholar 

  24. Ji B, Sun Y, Yang S, Wan J (2007) Artificial neural networks for rice yield prediction in mountainous regions. J Agric Sci 145(3):249–261

    Google Scholar 

  25. O’Neal MR, Engel BA, Ess DR, Frankenberger JR (2002) AE—Automation and emerging technologies: neural network prediction of maize yield using alternative data coding algorithms. Biosys Eng 83(1):31–45

    Google Scholar 

  26. Wolanin A, Mateo-García G, Camps-Valls G, Gómez-Chova L, Meroni M, Duveiller G, Guanter L (2020) Estimating and understanding crop yields with explainable deep learning in the Indian Wheat Belt. Environ Res Lett 15(2):024019

    Google Scholar 

  27. Yan W (2012) Toward automatic time-series forecasting using neural networks. IEEE Tran Neural Netw Learn Syst 23(7):1028–1039

    Google Scholar 

  28. Zheng C, Wang S, Liu Y, Liu C, Xie W, Fang C, Liu S (2019) A novel equivalent model of active distribution networks based on LSTM. IEEE Trans Neural Netw Learn Syst 30(9):2611–2624

    Google Scholar 

  29. Ergen T, Kozat SS (2017) Efficient online learning algorithms based on LSTM neural networks. IEEE Trans Neural Netw Learn Syst 29(8):3772–3783

    MathSciNet  Google Scholar 

  30. Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780

    Google Scholar 

  31. Huawei Technologies Co., Ltd.. (2023) Overview of Deep Learning. In: Artificial Intelligence Technology. Springer, Singapore. https://doi.org/10.1007/978-981-19-2879-6_3

  32. Wu Y, Yuan M, Dong S, Lin L, Liu Y (2018) Remaining useful life estimation of engineered systems using vanilla LSTM neural networks. Neurocomputing 275:167–179

    Google Scholar 

  33. Jiang Z, Liu C, Ganapathysubramanian B, Hayes DJ, Sarkar S (2020) Predicting county-scale maize yields with publicly available data. Sci Rep 10(1):1–12

    Google Scholar 

  34. Sathya P, Gnanasekaran P (2023) Paddy yield prediction in Tamilnadu Delta Region using MLR-LSTM model. Appl Artif Intell 37(1)

  35. Crisóstomo de Castro Filho H, Abílio de Carvalho Júnior O, Ferreira de Carvalho OL, Pozzobon de Bem P, dos Santos de Moura R, Olino de Albuquerque A, Trancoso Gomes RA (2020) Rice crop detection using LSTM, Bi-LSTM, and machine learning models from sentinel-1 time series. Remote Sensing 12(16):2655

    Google Scholar 

  36. Ramesh KV, Rakesh V, Rao EVS (2020) Application of big data analytics and artificial intelligence in agronomic research. Indian J Agron 65(4):383–395

    Google Scholar 

  37. Tian H, Wang P, Tansey K, Zhang J, Zhang S, Li H (2021) An LSTM neural network for improving wheat yield estimates by integrating remote sensing data and meteorological data in the Guanzhong plain, PR China. Agric Forest Meteorol 310:108629

    Google Scholar 

  38. Yin J, Deng Z, Ines AV, Wu J, Rasu E (2020) Forecast of short-term daily reference evapotranspiration under limited meteorological variables using a hybrid bi-directional long short-term memory model (bi-LSTM). Agric Water Manag 242:106386

    Google Scholar 

  39. Nishu B, Anshu S (2021) Deep learning based wheat crop yield prediction model in Punjab Region of North India. Appl Artif Intell 35(15):1304–1328. https://doi.org/10.1080/08839514.2021.1976091

    Article  Google Scholar 

  40. Maharana K, Mondal S, Nemade B (2022) A review: data pre-processing and data augmentation techniques. Global Transit Proc 3(1):91–99. https://doi.org/10.1016/j.gltp.2022.04.020

    Article  Google Scholar 

  41. Salmerón R, García CB, García J (2018) Variance inflation factor and condition number in multiple linear regression. J Stat Comput Simul 88(12):2365–2384

    MathSciNet  MATH  Google Scholar 

  42. Van Houdt G, Mosquera C, Nápoles G (2020) A review on the long short-term memory model. Artif Intell Rev 53(8):5929–5955

    Google Scholar 

  43. Hyndman RJ, Koehler AB (2006) Another look at measures of forecast accuracy. Int J Forecast 22(4):679–688

    Google Scholar 

  44. Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297

    MATH  Google Scholar 

  45. Annamalai N, Johnson A (2023) Analysis and forecasting of area under cultivation of Rice in India: univariate time series approach. SN Comput Sci 4:193. https://doi.org/10.1007/s42979-022-01604-0

    Article  Google Scholar 

  46. Anggraeni W, Andri KB, Mahananto F (2017) The performance of ARIMAX model and vector autoregressive (VAR) model in forecasting strategic commodity price in Indonesia. Procedia Comput Sci 124:189–196

    Google Scholar 

  47. Holtz-Eakin D, Newey W, Rosen HS (1988) Estimating vector autoregressions with panel data. Econometrica J Econ Soci 56(6):1371–1395

    MATH  Google Scholar 

  48. Greff K, Srivastava RK, Koutnik J, Steunebrink BR, Schmidhuber J (2017) LSTM: A search space odyssey. IEEE Trans Neural Netw Learn Syst 28(10):2222–2232. https://doi.org/10.1109/TNNLS.2016.2582924

    Article  MathSciNet  Google Scholar 

  49. Schuster M, Paliwal KK (1997) Bidirectional recurrent neural networks. IEEE Trans Signal Proces 45:2673–268142

    Google Scholar 

  50. Graves A, Schmidhuber J (2005) Frame wise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Netw 18(5–6):602–610

    Google Scholar 

  51. Yadav A, Jha CK, Sharan A (2020) Optimizing LSTM for time series prediction in Indian stock market. Procedia Comput Sci 167:2091–2100

    Google Scholar 

  52. Ghimire S, Yaseen ZM, Farooque AA et al (2021) Streamflow prediction using an integrated methodology based on convolutional neural network and long short-term memory networks. Sci Rep 11:17497. https://doi.org/10.1038/s41598-021-96751-4

    Article  Google Scholar 

  53. Sheela KG, Deepa SN (2013) Review on methods to fix number of hidden neurons in neural networks. Math Probl Eng 2013:425740. https://doi.org/10.1155/2013/425740

    Article  Google Scholar 

  54. Kandel I, Castelli M (2020) The effect of batch size on the generalizability of the convolutional neural networks on a histopathology dataset. ICT Express 6(4):312–315

    Google Scholar 

  55. Baldi P, Sadowski P (2014) The dropout learning algorithm. Artif Intell 210:78–122

    MathSciNet  MATH  Google Scholar 

  56. Verma P, Tripathi V, Pant B (2021) Comparison of different optimizers implemented on the deep learning architectures for COVID-19 classification. Mater Today Proc 46:11098–11102

    Google Scholar 

  57. Farzad A, Mashayekhi H, Hassanpour H (2019) A comparative performance analysis of different activation functions in LSTM networks for classification. Neural Comput Appl 31(7):2507–2521

    Google Scholar 

  58. Xu QS, Liang YZ (2001) Monte Carlo cross validation. Chemom Intell Lab Syst 56(1):1–11

    Google Scholar 

Download references

Acknowledgements

The authors would like to thank National Mission on Himalayan studies for funding “Integrated system dynamical model to design and testing alternative intervention strategies for effective remediation and sustainable water management for two selected river basins of Indian Himalayas” and “Enhancement of the quality of livelihood opportunities and resilience for the people in the Indian Himalayas, through design of intervention strategies aimed at maximizing resource potential and minimizing risks in urban-rural ecosystem (NMHS-2017/MG-04/480 and NMHS-2017-18/MG-02/478).

Funding

The research was not supported any funding other than institutional support from CSIR, INDIA.

Author information

Authors and Affiliations

Authors

Contributions

Kiran Kumar contributed in analyzing data and generating figures. Ramesh and Rakesh contributed in conceptualize, design, data analysis and drafting the manuscript.

Corresponding author

Correspondence to K. V. Ramesh.

Ethics declarations

Ethics approval

Complied with Ethical Standards of Applied Intelligence Journal.

Consent to participate

All listed authors have approved the manuscript before submission.

Consent for publication

All authors agreed with the content and gave explicit consent to submit and that they obtained consent from the responsible authorities at the institute/organization where the work has been carried out, before the work is submitted.

Conflicts of interest

None.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Kiran Kumar, V., Ramesh, K.V. & Rakesh, V. Optimizing LSTM and Bi-LSTM models for crop yield prediction and comparison of their performance with traditional machine learning techniques. Appl Intell 53, 28291–28309 (2023). https://doi.org/10.1007/s10489-023-05005-5

Download citation

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10489-023-05005-5

Keywords

Navigation