Late fusion of multimodal deep neural networks for weeds classification
Introduction
Weeds are one of the most critical factors in the reduction of agricultural productivity, increasing the burden on farmers’ crops. Correct identification of weeds is an essential step in analyzing the threats they pose to agricultural productivity. Weeds belonging to the same family can be challenging to identify visually because of their natural similarity. Various methods that rely on physical features to classify weeds, such as naive Bayes and Gaussian mixture models, were adopted by De Rainville et al. (2014). Conversely, Persson and Åstrand (2008) applied active shape models to extract weeds from images and k-nearest neighbors (KNN) for classification. However, these handcrafted features were overly simplified and could not capture the characteristics of weeds in real-world applications.
Recently, with the improvement of camera quality and the development of hardware specifications, especially of graphics processing units (GPUs), researchers have achieved significant results on the weeds classification problem by using deep learning models. Dyrmann et al. (2016) built a convolutional neural network (CNN) model based on the concept of the ResNet model. Chavan and Nandedkar (2018) combined the underlying architectures of AlexNet and VGG to form AgroAVNET, which includes 6 convolutional and 3 fully connected layers. They applied it to a small dataset (the Plant Seedlings dataset), which includes 12 species and approximately 4200 images from the Aarhus University Computer Vision and Biosystems Signal Processing Group (Giselsson et al., 2017). Their model exhibited impressive performance; however, these deep learning-based methods were not sufficient for classifying the Plant Seedlings dataset, which includes complex weed structures.
To solve this problem, we propose a novel classification approach using a voting method with the late fusion of multimodal DNNs. The score used for voting is calculated either with the Bayesian conditional probability-based combination proposed in Kittler et al. (1998) or with priority weights determined from the performance of the DNN models. The species are then scored using a weighted linear combination or a weighted power multiplication: the better a model performs, the higher its priority weight. We tested our voting method experimentally using the late fusion of 5 DNN models on the CNU and Plant Seedlings weeds datasets. We demonstrate that combining multiple DNN models and giving priority to the best-performing models can provide more accurate classification results in near real time on a central processing unit (CPU) or GPU.
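The two scoring rules described above can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: `fuse_scores`, the example probabilities, and the weight values are all hypothetical, and the weight normalization is an assumption for readability.

```python
import numpy as np

def fuse_scores(probs, weights, mode="linear"):
    """Fuse per-model class-probability vectors into one score vector.

    probs   : (n_models, n_species) softmax outputs, one row per DNN
    weights : (n_models,) priority weights (better model -> larger weight)
    mode    : "linear" -> weighted linear combination
              "power"  -> weighted power multiplication
    """
    probs = np.asarray(probs, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize the priority weights
    if mode == "linear":
        return (w[:, None] * probs).sum(axis=0)
    if mode == "power":
        return np.prod(probs ** w[:, None], axis=0)
    raise ValueError(f"unknown mode: {mode}")

# Two hypothetical models voting over three species; model 0 is trusted more.
p = [[0.6, 0.3, 0.1],
     [0.2, 0.5, 0.3]]
print(np.argmax(fuse_scores(p, [0.7, 0.3], "linear")))  # 0 (species 0 wins)
print(np.argmax(fuse_scores(p, [0.7, 0.3], "power")))   # 0
```

The predicted species is simply the argmax of the fused score vector; with equal weights the linear rule reduces to plain probability averaging.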
The remainder of this paper is organized as follows. Section 2 summarizes previous approaches to weeds classification using fusion and DNN models. The CNU Weeds and Plant Seedlings datasets are introduced in Section 3. Section 4 describes the overall voting method for weeds classification using the late fusion of multimodal DNNs and introduces the derivation of the score vector and the formula for determining the priority weights. Section 5 presents our experimental setup, evaluation metrics, relative error reduction, performance results, and time measurements. The study is concluded in the final section.
Section snippets
Information fusion
Deep learning performance is generally improved by object-based multi-fusion. For weeds classification, Dyrmann et al. (2018) segmented the plants from the background and then separated the leaves from the plants. They fused particular features of the leaves and plants by proposing a Bayesian belief integration to estimate the belief of a correct prediction. This method, however, suffered from the difficulty of leaf segmentation: when a leaf overlaps the target leaf, the belief may be inhibited.
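As a rough illustration of this kind of belief integration (not Dyrmann et al.'s exact formula), two classifiers' probability vectors can be combined with Bayes' rule under a conditional-independence assumption; `bayes_fuse` and its inputs are hypothetical:

```python
import numpy as np

def bayes_fuse(p_leaf, p_plant, prior=None):
    """Combine leaf- and plant-level class probabilities via Bayes' rule,
    assuming the two predictions are conditionally independent given the
    species (a common simplification)."""
    p_leaf, p_plant = np.asarray(p_leaf, float), np.asarray(p_plant, float)
    if prior is None:
        prior = np.full_like(p_leaf, 1.0 / p_leaf.size)  # uniform class prior
    belief = p_leaf * p_plant / prior   # proportional to P(species | leaf, plant)
    return belief / belief.sum()        # renormalize to a probability vector

print(bayes_fuse([0.7, 0.3], [0.6, 0.4]))  # agreement sharpens the belief
```

Note how two moderately confident, agreeing predictions yield a fused belief more confident than either input; a poorly segmented, overlapping leaf weakens `p_leaf` and thus the fused belief.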
Plant Seedlings dataset
This dataset was created by the Computer Vision and Biosystems Signal Processing Group (2018) at the Department of Engineering, Aarhus University, Denmark (Giselsson et al., 2017). On their website, they provide 3 types of data: original, segmented, and cropped. The original images contain a small area surrounding the plant seedlings, which can make it difficult to extract microfeatures from the surface of the plants. High background noise can also negatively impact the training process.
Multimodal fusion
Each DNN architecture learns specific structures from the weeds; therefore, combining their different perspectives may yield better classification results. Based on this idea, we approach the weeds classification problem with a multimodal DNN system and fuse the information through scoring methods to classify the species. Early fusion aims to combine information from different DNNs at the feature level, requiring careful analysis of which feature layers are suitable for fusion to avoid
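The distinction between the two fusion levels can be sketched in a few lines. All shapes and values here are hypothetical placeholders for the outputs of two DNN branches:

```python
import numpy as np

# Hypothetical outputs of two DNN branches for one image.
feat_a = np.random.rand(128)                      # penultimate-layer features, model A
feat_b = np.random.rand(256)                      # penultimate-layer features, model B
prob_a = np.array([0.8, 0.2])                     # softmax output, model A
prob_b = np.array([0.6, 0.4])                     # softmax output, model B

# Early fusion: merge at the feature level, then train a shared classifier head.
early_input = np.concatenate([feat_a, feat_b])    # shape (384,)

# Late fusion: each model classifies independently; only the scores are merged.
late_score = (prob_a + prob_b) / 2                # simple unweighted average
print(late_score)                                 # [0.7 0.3]
```

Late fusion avoids the layer-compatibility question entirely, since every model contributes only a probability vector of the same length.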
Performance metrics
Suppose the evaluation dataset contains images of K species (s_1, ..., s_K), C_i is the set of images classified as species s_i (1 ≤ i ≤ K), and x is an image in the set C_i. Assume p(x) is the prediction result and y(x) is the correct label for image x. We then have the following expressions.
- True Positive (TP_i): The number of images in C_i that are classified correctly.
- False Positive (FP_i): The number of images in C_i that are classified
Conclusion
We introduced a voting method using the late fusion of multimodal DNNs for weeds classification. Our main contribution was proposing the Bayesian conditional probability-based and priority-weight scoring methods to calculate the score vector used for voting. Our method estimates priority weights for models (or species) so that better models (or models that perform better for a particular species) have a higher priority of contribution in the scoring step. The experimental results showed that
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgement
This work was carried out with the support of the “Cooperative Research Program for Agriculture Science and Technology Development (Project No. PJ01385501)”, Rural Development Administration, Republic of Korea.
References (21)
- Chavan and Nandedkar, 2018. AgroAVNET for crops and weeds classification: a step forward in automatic farming. Comput. Electron. Agric.
- Dyrmann et al., 2016. Plant species classification using deep convolutional neural network. Biosyst. Eng.
- Persson and Åstrand, 2008. Classification of crops and weeds extracted by active shape models. Biosyst. Eng.
- et al., 2014. Hierarchical feature representation and multimodal fusion with deep learning for AD/MCI diagnosis. NeuroImage.
- et al., 2014. Multimodal fusion framework: a multiresolution approach for emotion classification and recognition from physiological signals. NeuroImage.
- Computer Vision and Biosystems Signal Processing Group, 2018. Plant Seedlings Dataset, Cropped plants, V2. [Online]...
- De Rainville et al., 2014. Bayesian classification and unsupervised learning for isolating weeds in row crops. Pattern Anal. Appl.
- Dyrmann et al., 2018. Estimation of plant species by classifying plants and leaves in combination. J. Field Rob.
- et al., 2015. Multimodal Deep Learning for Robust RGB-D Object Recognition. s.l.
- Giselsson, T.M. et al., 2017. A public image database for benchmark of plant seedling classification algorithms. arXiv...
1. The authors contributed equally to this work.