A particle swarm optimization based ensemble for vegetable crop disease recognition

https://doi.org/10.1016/j.compag.2020.105747

Highlights

  • A new EnsPSO approach is presented for multiclass classification.

  • The EnsPSO is applied to a multiclass vegetable crop disease dataset.

  • The EnsPSO outperforms the Vote ensemble for vegetable crop disease recognition.

  • The EnsPSO also outperforms Vote on standard datasets.

Abstract

Ensemble methods give better performance than a single machine learning algorithm, and Vote is one of the best ensembles. In the present work, Vote merges predictions from the Simple Logistic and Naive Bayes algorithms. The paper presents a new ensemble approach, Ensemble Particle Swarm Optimization (EnsPSO), which combines (i) Vote, (ii) the Correlation-based Feature Selection (CFS) method, (iii) the PSO algorithm and (iv) a random sampling method. The EnsPSO achieves higher classification accuracy (96%) than Vote (84%). The improvement is also demonstrated using ten-fold cross validation on 3 standard datasets.

Introduction

Machine learning algorithms enable effective decision making (Wolpert, 1992, Alpaydin, 2004, Chaudhary et al., 2016a, Chaudhary et al., 2016b, Uddin et al., 2019) when applied to high-dimensional agricultural data (Chaudhary et al., 2013a, Chaudhary et al., 2013b, Liakos et al., 2018, Rangarajan et al., 2018, Lawrence et al., 2020). The algorithms efficiently mine the complex relationships in the data (Rocha et al., 2010). Feature selection methods help in choosing the most relevant features from large datasets (Timmermans and Hulzebosch, 1996, Kundu et al., 2011, Hill et al., 2014, El-Bendary et al., 2015). Researchers have shown that Logistic Regression and Naïve Bayes correctly identify plant diseases (Baker and Kirk, 2007, Gutiérrez et al., 2008, Sankaran et al., 2010, Phadikar et al., 2013).

Brinjal, Beet, Cabbage, Celery, Chilli, French bean, Okra, Onion, Turnip, Potato, Tomato and Pepper are vital vegetable crops. An important reason for their unstable and low production is the incidence of pest infestations and diseases. Different bacteria, fungi, viruses, nematodes and physiological disorders are responsible for these diseases. Exact recognition of a disease is a multiclass classification problem.

The present work classifies diseases using data samples for Anthracnose, Bacterial wilt, Black leg, Black-rot, Chilli-mosaic, Club-root, Downy-mildew, Early blight, Fusarium wilt, Gray mold, Late blight, Leaf-spot, Onion-smut, Powdery-mildew, Rust, Septoria leaf spot, Verticillium wilt and Yellow vein mosaic.

Ensembles classify better than individual machine learning algorithms (Hansen and Salamon, 1990, Schapire, 1990, Breiman, 1996, Ho, 1998, Bay, 1999, Opitz, 1999, Ting and Witten, 1999, Zheng and Webb, 1999, Dietterich, 2000, Stamatatos and Widmer, 2005, Kotsiantis, 2007, Sun et al., 2007, Bolón-Canedo et al., 2012, Hsu, 2012, Farid et al., 2014).

The present work proposes a new approach, EnsPSO, intended to improve on the performance of Vote. The EnsPSO is a combination of (i) Vote, (ii) the CFS method, (iii) the PSO algorithm and (iv) a random sampling method. The work also compares the performance of the proposed EnsPSO approach with that of the Vote ensemble. The EnsPSO approach is applied to the recognition of vegetable crop diseases. Section 2 describes the materials and methods used. Section 3 describes the proposed EnsPSO approach. Section 4 presents and discusses the results. Section 5 summarizes the conclusions drawn.

Section snippets

Materials and methods

The present work is conducted using WEKA (Witten and Frank, 2005, Hall et al., 2009). WEKA provides a range of supervised and unsupervised machine learning algorithms, together with an extensive set of data pre-processing and modeling methods.
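To make the Vote baseline concrete, the following is a minimal sketch of how an ensemble can merge the class probabilities produced by two base classifiers. It assumes the average-of-probabilities combination rule; the source does not state which rule was used. The probability values and the function names are hypothetical and stand in for the outputs of Simple Logistic and Naive Bayes on a single sample.

```python
def average_of_probabilities(distributions):
    """Average the per-class probabilities given by several classifiers.

    distributions: list of dicts mapping class label -> probability,
    one dict per base classifier, all sharing the same labels.
    """
    n = len(distributions)
    labels = distributions[0].keys()
    return {label: sum(d[label] for d in distributions) / n for label in labels}


def vote(distributions):
    """Return the label with the highest averaged probability."""
    averaged = average_of_probabilities(distributions)
    return max(averaged, key=averaged.get)


# Hypothetical outputs of the two base classifiers for one sample.
simple_logistic = {"Early blight": 0.55, "Late blight": 0.40, "Rust": 0.05}
naive_bayes = {"Early blight": 0.20, "Late blight": 0.70, "Rust": 0.10}

print(vote([simple_logistic, naive_bayes]))  # Late blight
```

In this example the two classifiers disagree, and averaging lets the more confident one (Naive Bayes, with 0.70 for Late blight) decide the outcome.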

The proposed EnsPSO approach

The EnsPSO is intended to improve the disease classification accuracy as compared to Vote. The pseudo-code of EnsPSO is shown in Algorithm 1. Consider a disease dataset D = Dtraining ∪ Dtesting, where D contains the disease-influencing features and the resulting diseases.

Algorithm 1. EnsPSO

Input: Dtraining = {S, F, C} where S = {s1, s2,…,sn} is a non-empty finite set of n samples such that each sk ∈ S, 1 ≤ k ≤ n, has m features F = {f1, f2,…, fm} such that for each fp ∈ F, 1 ≤ p ≤ m, and
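The listing of Algorithm 1 is cut off in this snippet, so the following is not the authors' procedure. It is only a minimal sketch of how a binary PSO could search for a feature subset that maximizes a CFS-style merit, which covers two of the four ingredients the paper names. The correlation values, the swarm parameters (w, c1, c2) and the names cfs_merit and binary_pso are assumptions made for illustration. The merit is the standard CFS heuristic, k·r_cf / sqrt(k + k(k−1)·r_ff), and the Vote and random sampling steps are left out.

```python
import math
import random


def cfs_merit(mask, r_cf, r_ff):
    """CFS merit of the subset flagged by mask (1 = feature selected)."""
    chosen = [i for i, bit in enumerate(mask) if bit]
    k = len(chosen)
    if k == 0:
        return 0.0
    mean_cf = sum(abs(r_cf[i]) for i in chosen) / k
    pairs = [(i, j) for a, i in enumerate(chosen) for j in chosen[a + 1:]]
    mean_ff = sum(abs(r_ff[i][j]) for i, j in pairs) / len(pairs) if pairs else 0.0
    return k * mean_cf / math.sqrt(k + k * (k - 1) * mean_ff)


def binary_pso(r_cf, r_ff, n_particles=10, n_iter=30, seed=0):
    """Search for a high-merit feature mask with a sigmoid-based binary PSO."""
    rng = random.Random(seed)
    m = len(r_cf)
    w, c1, c2 = 0.7, 1.5, 1.5  # assumed inertia and acceleration weights
    pos = [[rng.randint(0, 1) for _ in range(m)] for _ in range(n_particles)]
    vel = [[0.0] * m for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [cfs_merit(p, r_cf, r_ff) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(m):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                # The sigmoid of the velocity gives the probability of a 1 bit.
                prob_one = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob_one else 0
            fit = cfs_merit(pos[i], r_cf, r_ff)
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


# Hypothetical correlations: features 0 and 1 are strongly redundant.
r_cf = [0.80, 0.75, 0.10, 0.60]
r_ff = [[1.0, 0.9, 0.1, 0.2],
        [0.9, 1.0, 0.1, 0.2],
        [0.1, 0.1, 1.0, 0.1],
        [0.2, 0.2, 0.1, 1.0]]

best_mask, best_merit = binary_pso(r_cf, r_ff)
print(best_mask, round(best_merit, 3))
```

The merit rewards features that correlate with the class and penalizes features that correlate with each other, so pairing feature 0 with the weakly related feature 3 scores higher than pairing it with the redundant feature 1.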

Results and discussion

A 10-fold cross validation strategy gives consistent estimates of classification accuracy when assessing the performance of a machine learning algorithm (Baldi et al., 2000, Azar et al., 2014). In the present work, the strategy divides the vegetable crop disease dataset into 10 folds. Each fold is held out once for testing while the remaining nine are used for training, which yields 10 evaluation results that are then averaged. Hence, each experiment is performed using 10-fold cross validation
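The fold-and-average procedure described above can be sketched in a few lines. This is a plain, unstratified split written for illustration; the source does not say whether the folds were stratified. The majority-class learner and the toy labels are hypothetical stand-ins, not the classifiers or data used in the paper.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and deal them into k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(X, y, train_fn, predict_fn, k=10, seed=0):
    """Return the mean accuracy over k folds and the per-fold accuracies."""
    scores = []
    for held_out in k_fold_indices(len(y), k, seed):
        test_set = set(held_out)
        train_idx = [i for i in range(len(y)) if i not in test_set]
        model = train_fn([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict_fn(model, X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / k, scores


# Stand-in learner: always predicts the most frequent training label.
def train_majority(X_train, y_train):
    return Counter(y_train).most_common(1)[0][0]


def predict_majority(model, x):
    return model


# Toy data: 30 "Rust" samples and 10 "Late blight" samples.
X = [[i] for i in range(40)]
y = ["Rust"] * 30 + ["Late blight"] * 10

mean_acc, fold_accs = cross_validate(X, y, train_majority, predict_majority)
print(mean_acc)  # 0.75
```

Every sample is tested exactly once, so with equal-sized folds the averaged accuracy equals the overall fraction of correctly classified samples.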

Conclusions

The EnsPSO approach is an important contribution of the present work. The EnsPSO, presented for multiclass classification problems, successfully recognizes the vegetable crop diseases. The EnsPSO raises the disease classification accuracy to 96%, compared with 84% for Vote. The EnsPSO also performs better than Vote on other performance measures. The EnsPSO approach is also tested for classification accuracy with 10-fold cross validation strategy on 3

CRediT authorship contribution statement

Archana Chaudhary: Conceptualization, Methodology, Data curation, Writing - original draft. Ramesh Thakur: Visualization, Investigation, Formal analysis. Savita Kolhe: Validation. Raj Kamal: Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References (60)

  • R. Kohavi et al.

    Wrappers for feature subset selection

    Artif. Intell.

    (1997)
  • S. Kolhe et al.

    An intelligent multimedia interface for fuzzy-logic based inference in crops

    Expert Syst. Appl.

    (2011)
  • A. Kumar et al.

    Optimizing feature selection using particle swarm optimization and utilizing ventral sides of leaves for plant leaf classification

    Twelfth International Multi-Conference on Information Processing

  • A. Özcift

    Random forests ensemble classifier trained with data resampling strategy to improve cardiac arrhythmia diagnosis

    Comput. Biol. Med.

    (2011)
  • S. Phadikar et al.

    Rice diseases classification using feature selection and rule generation techniques

    Comput. Electron. Agric.

    (2013)
  • A.K. Rangarajan et al.

    Tomato crop disease classification using pre-trained deep learning algorithm

    Procedia Comput. Sci.

    (2018)
  • A. Rocha et al.

    Automatic fruit and vegetable classification from images

    Comput. Electron. Agric.

    (2010)
  • S. Sankaran et al.

    A review of advanced techniques for detecting plant diseases

    Comput. Electron. Agric.

    (2010)
  • L. Silva et al.

    Comparative assessment of feature selection and classification techniques for visual inspection of pot plant seedlings

    Comput. Electron. Agric.

    (2013)
  • M. Sokolova et al.

    A systematic analysis of performance measures for classification tasks

    Inf. Process. Manage.

    (2009)
  • S. Sun et al.

    An experimental evaluation of ensemble methods for EEG signals classification

    Pattern Recogn. Lett.

    (2007)
  • E. Stamatatos et al.

    Automatic identification of music performers with learning ensembles

    Artif. Intell.

    (2005)
  • A.J.M. Timmermans et al.

    Computer vision system for on-line sorting of pot plants using an artificial neural network classifier

    Comput. Electron. Agric.

    (1996)
  • D.H. Wolpert

    Stacked generalization

    Neural Networks

    (1992)
  • E. Alpaydin

    Introduction to machine learning

    (2004)
  • P. Baldi et al.

    Assessing the accuracy of prediction algorithms for classification and overview

    Bioinformatics

    (2000)
  • E. Bauer et al.

    An empirical comparison of voting classification algorithms: bagging, boosting, and variants

    Mach. Learn.

    (1999)
  • L. Breiman

    Bagging predictors

    Mach. Learn.

    (1996)
  • A. Chaudhary et al.

    Machine learning classification techniques: a comparative study

    Int. J. Adv. Comput. Theory Eng.

    (2013)
  • A. Chaudhary et al.

    Performance evaluation of feature selection methods for mobile devices

    Int. J. Eng. Res. Appl.

    (2013)