ABSTRACT
Ensemble learning is a powerful machine learning paradigm that leverages a collection of diverse base learners to achieve better prediction performance than any individual base learner could achieve alone. This work proposes an ensemble learning framework based on evolutionary feature subspace generation, which formulates the search for the most suitable feature subspace for each base learner as an optimization problem. Multiple such problems, corresponding to different base learners, are solved simultaneously by an evolutionary multi-task feature selection algorithm, so that solving one problem may help solve the others via implicit knowledge transfer. The feature subspaces generated in this way are expected to outperform those obtained by seeking the optimal feature subspace for each base learner individually. We implement the proposed framework by using SVM, KNN, and decision tree as the base learners, proposing a multi-task binary particle swarm optimization algorithm for evolutionary multi-task feature selection, and combining the outputs of the base learners via majority voting. Experiments on several UCI datasets demonstrate the effectiveness of the proposed method.
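To make the pipeline concrete, the following is a minimal, simplified sketch of the framework described above: a binary PSO searches a feature subspace for each of the three base learners (SVM, KNN, decision tree), and the learners' predictions are combined by majority voting. This sketch is not the authors' algorithm; in particular, it omits the multi-task optimizer and its cross-task knowledge transfer, running an independent single-task search per learner instead. The dataset, swarm size, iteration budget, and PSO coefficients are all illustrative assumptions.

```python
# Hypothetical sketch of per-learner feature-subspace search + majority voting.
# NOTE: single-task binary PSO per learner; the paper's multi-task variant with
# implicit knowledge transfer between tasks is not reproduced here.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)  # stand-in UCI-style dataset
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

def fitness(mask, clf):
    """Cross-validated accuracy of clf on the selected feature subset."""
    m = mask.astype(bool)
    if not m.any():
        return 0.0
    return cross_val_score(clf, X_tr[:, m], y_tr, cv=3).mean()

def binary_pso(clf, n_particles=8, iters=10, w=0.7, c1=1.5, c2=1.5):
    """Binary PSO over 0/1 feature masks using a sigmoid velocity transfer."""
    d = X_tr.shape[1]
    pos = (rng.random((n_particles, d)) < 0.5).astype(int)
    vel = rng.normal(0.0, 1.0, (n_particles, d))
    pbest, pbest_fit = pos.copy(), np.array([fitness(p, clf) for p in pos])
    g, g_fit = pbest[pbest_fit.argmax()].copy(), pbest_fit.max()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, d))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (g - pos)
        pos = (rng.random((n_particles, d)) < 1.0 / (1.0 + np.exp(-vel))).astype(int)
        fit = np.array([fitness(p, clf) for p in pos])
        better = fit > pbest_fit
        pbest[better], pbest_fit[better] = pos[better], fit[better]
        if fit.max() > g_fit:
            g_fit, g = fit.max(), pos[fit.argmax()].copy()
    return g

learners = [SVC(), KNeighborsClassifier(), DecisionTreeClassifier(random_state=0)]
masks = [binary_pso(clf) for clf in learners]
preds = np.array([clf.fit(X_tr[:, m.astype(bool)], y_tr)
                     .predict(X_te[:, m.astype(bool)])
                  for clf, m in zip(learners, masks)])
vote = (preds.sum(axis=0) >= 2).astype(int)  # majority vote over 3 learners
acc = (vote == y_te).mean()
```

In the paper's full method, the three subspace-search tasks would share one population in an evolutionary multi-task optimizer, whereas this sketch runs three independent swarms.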
Index Terms
- Evolutionary feature subspaces generation for ensemble classification