Expert Systems with Applications

Volume 53, 1 July 2016, Pages 192-203

Root-quatric mixture of experts for complex classification problems

https://doi.org/10.1016/j.eswa.2016.01.040

Highlights

  • We design a new ensemble system based on mixture of experts.

  • An anti-correlation measure is added to the error function of the mixture of experts.

  • The gating network assigns weights to all output neurons of the experts.

  • The effect of anti-correlation measure is investigated.

  • Increasing the anti-correlation measure increases the diversity among the experts.

Abstract

Mixture of experts (ME) is an ensemble method consisting of several experts and a gating network that decomposes the input space into subspaces according to the experts' specialties. To increase the diversity between experts in ME, this paper incorporates a correlation penalty function into the error function of ME. The significance of this modification is that it encourages the experts to specialize in different parts of the input space and creates decorrelated experts. The experimental results reveal that this penalty function substantially improves the diversity of the experts and the tradeoff between accuracy and diversity in ME. Moreover, in the implementation of this method the experts are trained simultaneously and communicate through the correlation penalty function. The performance of the proposed method on ten classification benchmark datasets shows that its average accuracy improves by 1.94%, 3.7%, and 3.74% compared with the mixture of negatively correlated experts, ME, and negative correlation learning, respectively. Thus the proposed method can be considered a better classifier for health and medical problems, and for classifying large non-stationary data.
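
For orientation, the display below sketches the general template the abstract refers to: the ME combination rule and an expert error function augmented with a correlation penalty. The penalty p_i shown is the classic negative-correlation term of Liu and Yao (1999); the paper's root-quartic measure replaces it but is not reproduced in this preview, so treat this only as a hedged sketch of the general form.

    % General template (not the paper's exact RTQRT-ME formula): the ME
    % output y is a gated combination of M expert outputs o_i, and each
    % expert's error E_i is augmented with a correlation penalty p_i
    % (here the negative-correlation penalty of Liu & Yao, 1999).
    \[
      y(\mathbf{x}) = \sum_{i=1}^{M} g_i(\mathbf{x})\, o_i(\mathbf{x}),
      \qquad
      E_i = \tfrac{1}{2}\bigl(o_i(\mathbf{x}) - t\bigr)^2 + \lambda\, p_i,
      \qquad
      p_i = \bigl(o_i(\mathbf{x}) - y(\mathbf{x})\bigr)\sum_{j \neq i}\bigl(o_j(\mathbf{x}) - y(\mathbf{x})\bigr).
    \]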

Section snippets

Preliminaries and related works

Ensemble learning combines multiple experts (classifiers or regressors) trained on a sample problem. Their decisions are combined to obtain better generalization than the base models achieve alone. To this end, ensemble learning applies diverse experts and tries to minimize their mistakes. Different methods have been designed to create decorrelated (diverse) experts; they can be classified as explicit and implicit methods (Brown & Yao, 2001). Implicit methods indirectly …
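
As a concrete illustration of this generic ensemble idea, the sketch below trains a few standard base classifiers and combines their decisions by majority vote. All dataset, model, and parameter choices are arbitrary assumptions for illustration; the paper's ME approach instead learns a gating network to weight the experts.

    # Minimal sketch of the generic ensemble idea: several base
    # classifiers trained on the same problem, decisions combined by
    # majority vote (scikit-learn used purely for illustration).
    from sklearn.datasets import make_classification
    from sklearn.ensemble import VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=500, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    ensemble = VotingClassifier([            # hard majority voting
        ("lr", LogisticRegression(max_iter=1000)),
        ("knn", KNeighborsClassifier()),
        ("tree", DecisionTreeClassifier(random_state=0)),
    ])
    print("ensemble accuracy:", ensemble.fit(X_tr, y_tr).score(X_te, y_te))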

Root-quartic mixture of experts (RTQRT-ME) technique

In this section we introduce our proposed approach, root-quartic mixture of experts (RTQRT-ME), which incorporates a correlation penalty measure into the error function of ME. The main aim of designing an ensemble system is to improve the decision-making phase by combining the outputs of different base experts. For this purpose, the individual experts are required to be decorrelated from one another. Incorporating the correlation penalty measure into the error function of the system is one …
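
A minimal sketch of this idea, under stated assumptions: a forward pass in which the gating network assigns a weight to every output neuron of every expert (per the highlights) and a correlation penalty is added to the ensemble error. The negative-correlation penalty of Liu and Yao (1999) stands in for the paper's root-quartic measure, which the snippet does not give; all layer sizes and the lambda value are hypothetical.

    # Sketch only: NOT the paper's exact RTQRT-ME error function.
    import numpy as np

    rng = np.random.default_rng(0)
    n_experts, n_features, n_classes = 3, 8, 4
    x = rng.normal(size=n_features)          # one input sample
    t = np.eye(n_classes)[1]                 # one-hot target (class 1)

    # Random single-layer "experts" and a gating network (weights only).
    W_experts = rng.normal(size=(n_experts, n_classes, n_features))
    W_gate = rng.normal(size=(n_experts * n_classes, n_features))

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    # Expert outputs: one probability vector per expert.
    O = np.array([softmax(W @ x) for W in W_experts])   # (experts, classes)

    # Gating weight for every output neuron of every expert,
    # normalized over the experts for each class.
    G = softmax(W_gate @ x).reshape(n_experts, n_classes)
    G = G / G.sum(axis=0, keepdims=True)

    y = (G * O).sum(axis=0)                  # combined ensemble output

    # Negative-correlation penalty p_i = (o_i - y) . sum_{j != i} (o_j - y),
    # standing in for the paper's root-quartic measure.
    p = np.array([((O[i] - y) * (O.sum(axis=0) - O[i] - (n_experts - 1) * y)).sum()
                  for i in range(n_experts)])
    lam = 0.5                                # penalty strength (hypothetical)
    loss = ((y - t) ** 2).sum() + lam * p.sum()
    print(f"penalized ensemble error: {loss:.4f}")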

Simulation results

In this section, the empirical results of the proposed method on benchmark datasets are presented to illustrate its performance.

Conclusion

In this paper, a novel ME-based ensemble learning approach is presented. One of the most important problems in designing ensemble systems is the diversity between experts, which affects the performance of the system. In order to create diversity among the experts, several methods have been developed; in some of them the experts are trained independently, with no interaction or cooperation. In the proposed method, RTQRT-ME, a correlation penalty function is incorporated into the error function …

References (60)

  • Liu, Y., et al. Ensemble learning via negative correlation. Neural Networks (1999).

  • Lysiak, R., et al. Optimal selection of ensemble classifiers using measures of competence and diversity of base classifiers. Neurocomputing (2014).

  • Masoudnia, S., et al. Combining features of negative correlation learning with mixture of experts in proposed ensemble methods. Applied Soft Computing (2012).

  • Meo, R., et al. LODE: A distance-based classifier built on ensembles of positive and negative observations. Pattern Recognition (2012).

  • Peralta, B., et al. Embedded local feature selection within mixture of experts. Information Sciences (2014).

  • Rahman, A., et al. Ensemble classifier generation using non-uniform layered clustering and genetic algorithm. Knowledge-Based Systems (2013).

  • Simidjievski, N., et al. Predicting long-term population dynamics with bagging and boosting of process-based models. Expert Systems with Applications (2015).

  • Übeyli, E. D. Wavelet/mixture of experts network structure for EEG signals classification. Expert Systems with Applications (2008).

  • Yoon, J. W., et al. Adaptive mixture-of-experts models for data glove interface with multiple users. Expert Systems with Applications (2012).

  • Armano, G., et al. Run-time performance analysis of the mixture of experts model. Computer Recognition Systems (2011).

  • Arora, R., et al. Comparative analysis of classification algorithms on different datasets using WEKA. International Journal of Computer Applications (2012).

  • Asuncion, A., et al. UCI machine learning repository (2007).

  • Avnimelech, R., et al. Boosted mixture of experts: An ensemble learning scheme. Neural Computation (1999).

  • Bouchaffra, D. Induced subgraph game for ensemble selection.

  • Breiman, L. Bagging predictors. Machine Learning (1996).

  • Brown, G., et al. On the effectiveness of negative correlation learning.

  • Cao, K. A. L., et al. Integrative mixture of experts to combine clinical factors and gene markers. Bioinformatics (2010).

  • Chawla, N. V., et al. SMOTEBoost: Improving prediction of the minority class in boosting.

  • Demšar, J. Statistical comparisons of classifiers over multiple data sets. Journal of Machine Learning Research (2006).

  • Ebrahimpour, R., et al. Boost-wise pre-loaded mixture of experts for classification tasks. Neural Computing & Applications (2013).

Cited by (11)

    • Neural trees with peer-to-peer and server-to-client knowledge transferring models for high-dimensional data classification

      2019, Expert Systems with Applications
      Citation Excerpt :

      FELM-RT is a kind of mixture of experts in which each expert is a neural tree exploiting ELM in the nodes. For recent works on mixture of experts, see Abbasi, Shiri & Ghatee (2016a, b). In FELM-RT, the number of experts is equal to the number of clusters of features.

    • Classification using hierarchical mixture of discriminative learners: How to achieve high scores with few resources?

      2018, Expert Systems with Applications
      Citation Excerpt :

      In Kotsiantis (2011), they combine bagging, boosting, rotation forest, and random subspace methods to classify the data. The most recent ensemble methods, RTQRT-ME and R-RTQRT-ME (Abbasi et al., 2016b; 2016a), used MLPs as weighting functions and experts and introduced a regularization term to increase the diversity of the experts. For this experiment, we used the same number of experts (five) as in Abbasi et al. (2016b, 2016a), and the structure of the tree is determined by 5-fold cross-validation.

    • A context aware system for driving style evaluation by an ensemble learning on smartphone sensors data

      2018, Transportation Research Part C: Emerging Technologies
      Citation Excerpt :

      It is possible to use reinforcement learning and unsupervised learning to adjust these thresholds automatically. Besides, more modern learning algorithms such as mixture of experts (Abbasi et al., 2016) can be developed to solve this complex problem. To decrease battery consumption, we can use feature selection methods or decrease the sampling frequency of the sensors.

    • A regularized root-quartic mixture of experts for complex classification problems

      2016, Knowledge-Based Systems
      Citation Excerpt :

      In addition, as mentioned in [24], to encourage the experts to learn different regions of the input space, one can incorporate a correlation penalty function into the error function of the model; e.g., in [25], a negative correlation learning penalty term was incorporated into the error function of ME. In [26], a novel penalty correlation measure, namely RTQRT-ME, was incorporated into the error function to control the correlation between the experts. In addition, to improve the robustness of ME or RTQRT-ME in confronting overfitting and noise effects, in this paper we add a regularization term to the error function of RTQRT-ME.

    • Semi-explicit mixture of experts based on information table

      2023, Journal of Ambient Intelligence and Humanized Computing