Automatic arable land detection with supervised machine learning

  • Methodology Article
  • Earth Science Informatics

Abstract

In Precision Agriculture, one of the basic tasks is classifying land zones as either arable or non-arable. Several studies have addressed this task using data obtained from soil analysis or local exploration of the parcels. Sometimes, however, only satellite imagery is available; the problem then becomes more challenging, but also more attractive to solve, because satellite data are much more cost-effective to obtain. In this paper, we consider different spectral and thermal bands from Landsat 8 satellite images of a vineyard located in Galicia, a region in northwestern Spain, and apply a range of supervised Machine Learning methods to classify the different land zones. We conclude that an adequate choice of algorithm parameters, together with feature selection techniques, can yield a classification that is both highly effective and efficient.
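
The workflow summarised in the abstract (per-pixel band values, a supervised classifier, parameter tuning and feature selection) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the pixel values are synthetic, the three-band layout is hypothetical, and a k-nearest-neighbour classifier stands in for whichever methods the paper actually evaluates. It is not the authors' implementation.

```python
import math
import random
from collections import Counter

# Hypothetical feature layout (not from the paper): index 0 = red-like band,
# index 1 = near-infrared-like band, index 2 = an uninformative band.

def make_pixels(n, seed=0):
    """Generate synthetic labelled pixels as (features, label) pairs."""
    rng = random.Random(seed)
    pixels = []
    for _ in range(n):
        arable = rng.random() < 0.5
        # Assumption: arable pixels reflect less red and more near-infrared.
        red = rng.gauss(0.08 if arable else 0.20, 0.03)
        nir = rng.gauss(0.40 if arable else 0.25, 0.05)
        # Same distribution for both classes, so it carries no class signal.
        noise = rng.gauss(0.5, 0.2)
        pixels.append(((red, nir, noise), "arable" if arable else "non-arable"))
    return pixels

def knn_predict(train, x, k, features):
    """Majority vote among the k nearest training pixels,
    measuring distance only over the band indices in `features`."""
    dists = sorted(
        (math.dist([p[i] for i in features], [x[i] for i in features]), label)
        for p, label in train
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

def accuracy(train, test, k, features):
    """Fraction of test pixels whose predicted label matches the true label."""
    hits = sum(knn_predict(train, x, k, features) == y for x, y in test)
    return hits / len(test)

def grid_search(train, valid, ks, feature_sets):
    """Try every (feature subset, k) pair; return the best (accuracy, k, features)."""
    best = None
    for features in feature_sets:
        for k in ks:
            acc = accuracy(train, valid, k, features)
            if best is None or acc > best[0]:
                best = (acc, k, features)
    return best

data = make_pixels(300)
train, valid = data[:200], data[200:]
best_acc, best_k, best_features = grid_search(
    train, valid, ks=(1, 5, 9), feature_sets=((0, 1, 2), (0, 1), (2,))
)
print(best_acc, best_k, best_features)
```

The exhaustive search over a handful of feature subsets and values of `k` is only a toy version of parameter tuning and feature selection. With real imagery one would more likely use cross-validation and a filter or wrapper selection method rather than a single validation split.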



Author information

Corresponding author

Correspondence to I. Díaz.

Additional information

Communicated by: H. A. Babaie

This work has been supported by Farm-Oriented Open Data in Europe (FOODIE) Pilot B, from the European Union’s Seventh Framework Programme for Research, Technological Development and Demonstration, under grant agreement no. 621074.

About this article

Cite this article

Arango, R.B., Díaz, I., Campos, A. et al. Automatic arable land detection with supervised machine learning. Earth Sci Inform 9, 535–545 (2016). https://doi.org/10.1007/s12145-016-0270-6

  • DOI: https://doi.org/10.1007/s12145-016-0270-6
