A new hyperparameter to random forest: application of remote sensing in yield prediction

  • RESEARCH
  • Earth Science Informatics

Abstract

Amid growing concern about food security, accurate prediction of wheat yield before harvest is essential. Random Forest (RF) has been used in many classification and regression applications, including yield estimation, and its performance has been improved by tuning its hyperparameters. In this paper, several modifications are made to the traditional RF for yield estimation, and the performance of each is evaluated. RFs built from different weak learners, as well as a combined RF that mixes weak learners, are assessed by growing weak Gaussian Process Regression (GPR), Decision Tree (DT), Neural Network (NN), and Stepwise Regression (SW) models in the forest. Within the DTs, the input data at each parent node are also partitioned into leaves by clustering in an N-dimensional (e.g., two-dimensional) feature space. In addition, each learner in the forest is trained on a subset of the training set drawn randomly with replacement, rather than on a random sample of the whole training set as in traditional RF. Clustering in DTs added flexibility, while using NN as the weak learner gave the most favorable results in our research. The number of input training samples to each tree (Itree) was also identified as a new hyperparameter of the forest, and the prediction results were influenced more by Itree than by the known hyperparameters, such as the number of trees in the forest (ntree) and the number of features for each tree (mtry).
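The role of Itree alongside ntree and mtry can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes depth-1 regression trees (stumps) as the weak learner in place of the DT, NN, GPR, and SW models named above, and the function names (`fit_stump`, `fit_forest`, `predict`) and the toy data are hypothetical. Each learner is trained on `itree` rows drawn with replacement and on `mtry` randomly chosen features.

```python
import random
from statistics import mean


def fit_stump(X, y, feature_ids):
    """Fit a depth-1 regression tree on the given rows, using only feature_ids."""
    best = None  # (sse, feature, threshold, left_mean, right_mean)
    for f in feature_ids:
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [t for row, t in zip(X, y) if row[f] <= thr]
            right = [t for row, t in zip(X, y) if row[f] > thr]
            if not left or not right:  # guard against degenerate splits
                continue
            lm, rm = mean(left), mean(right)
            sse = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
            if best is None or sse < best[0]:
                best = (sse, f, thr, lm, rm)
    if best is None:  # no valid split: fall back to a constant prediction
        const = mean(y)
        return lambda row: const
    _, f, thr, lm, rm = best
    return lambda row: lm if row[f] <= thr else rm


def fit_forest(X, y, ntree=50, mtry=1, itree=10, seed=0):
    """Bagged ensemble with three knobs: ntree, mtry, and itree (assumed names)."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    learners = []
    for _ in range(ntree):
        rows = [rng.randrange(n) for _ in range(itree)]  # itree rows, with replacement
        feats = rng.sample(range(p), mtry)               # mtry features per learner
        learners.append(fit_stump([X[i] for i in rows], [y[i] for i in rows], feats))
    return learners


def predict(learners, row):
    """Average the predictions of all weak learners."""
    return mean(f(row) for f in learners)


if __name__ == "__main__":
    # Synthetic data for illustration only; not the wheat-yield dataset.
    X = [[i / 10.0, (i * 7 % 10) / 10.0] for i in range(40)]
    y = [2.0 * r[0] + 0.5 * r[1] for r in X]
    for itree in (2, 10, 40):
        forest = fit_forest(X, y, ntree=30, mtry=1, itree=itree, seed=1)
        mse = mean((predict(forest, r) - t) ** 2 for r, t in zip(X, y))
        print(f"itree={itree:2d}  training MSE={mse:.3f}")
```

Varying `itree` while holding `ntree` and `mtry` fixed is one simple way to probe how sensitive predictions are to the per-learner sample size; the abstract's conclusion about its relative influence comes from the paper's own experiments, not from this sketch.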

Data availability

The datasets analyzed during the current study are available on reasonable request.

Funding

No funding was received to assist with the preparation of this manuscript.

Author information

Contributions

The entire manuscript was authored and implemented solely by the first author.

Corresponding author

Correspondence to Mehrtash Manafifard.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Communicated by H. Babaie

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Manafifard, M. A new hyperparameter to random forest: application of remote sensing in yield prediction. Earth Sci Inform 17, 63–73 (2024). https://doi.org/10.1007/s12145-023-01156-8

  • DOI: https://doi.org/10.1007/s12145-023-01156-8
