Abstract
This paper presents a decision tree with RBF neural networks at its nodes to reduce classification time and to improve both accuracy and generalization. We propose two knowledge-transferring mechanisms between nodes that reduce duplicate computations during training. The resulting classifier is called an ensemble of RBF neural networks in a decision tree structure with knowledge transferring, or ERDK for short. We further accelerate ERDK by applying a cut-point mechanism that prunes its tree structure. Results on a large number of benchmark datasets show that ERDK is promising: its AUCarea, F-measure, and G-mean reach 95, 90, and 90%, respectively. As a case study, we finally apply ERDK to the Britain road-incident dataset.
References
Bhardwaj A, Tiwari A, Bhardwaj H, Bhardwaj A (2016) A genetically optimized neural network model for multi-class classification. Expert Syst Appl 60:211–221
Jagtap J, Kokare M (2016) Human age classification using facial skin aging features and artificial neural network. Cogn Syst Res 40:116–128
Zhong P, Fukushima M (2007) Regularized nonsmooth Newton method for multi-class support vector machines. Optim Methods Softw 22(1):225–236
Leng Y, Sun C, Xu X, Yuan Q, Xing S, Wan H, Li D (2016) Employing unlabeled data to improve the classification performance of SVM, and its application in audio event classification. Knowl Based Syst 98:117–129
Frías-Blanco I, del Campo-Ávila J, Ramos-Jiménez G, Carvalho AC, Ortiz-Díaz A, Morales-Bueno R (2016) Online adaptive decision trees based on concentration inequalities. Knowl Based Syst 104:179–194
Kim K (2016) A hybrid classification algorithm by subspace partitioning through semi-supervised decision tree. Pattern Recogn 60:157–163
Cui Z, Wang Y, Gao X, Li J, Zheng Y (2016) Multispectral image classification based on improved weighted MRF Bayesian. Neurocomputing 212:75–87
Verbiest N, Vluymans S, Cornelis C, García-Pedrajas N, Saeys Y (2016) Improving nearest neighbor classification using ensembles of evolutionary generated prototype subsets. Appl Soft Comput 44:75–88
Diez-Pastor JF, Rodríguez JJ, García-Osorio C, Kuncheva LI (2015) Random balance: ensembles of variable priors classifiers for imbalanced data. Knowl Based Syst 85:96–111
Liu XY, Wu J, Zhou ZH (2009) Exploratory undersampling for class-imbalance learning. IEEE Trans Syst Man Cybern Part B (Cybern) 39(2):539–550
Seiffert C, Khoshgoftaar TM, Van Hulse J, Napolitano A (2010) RUSBoost: a hybrid approach to alleviating class imbalance. IEEE Trans Syst Man Cybern Part A Syst Hum 40(1):185–197
Vorraboot P, Rasmequan S, Chinnasarn K, Lursinsap C (2015) Improving classification rate constrained to imbalanced data between overlapped and non-overlapped regions by hybrid algorithms. Neurocomputing 152:429–443
Chawla NV, Lazarevic A, Hall LO, Bowyer KW (2003) SMOTEBoost: improving prediction of the minority class in boosting. In: European conference on principles of data mining and knowledge discovery, pp 107–119. Springer, Berlin
Knauer U, Backhaus A, Seiffert U (2015) Fusion trees for fast and accurate classification of hyperspectral data with ensembles of γ-divergence-based RBF networks. Neural Comput Appl 26:253–262
Haixiang G, Yijing L, Yanan L, Xiao L, Jinling L (2016) BPSO-Adaboost-KNN ensemble learning algorithm for multi-class imbalanced data classification. Eng Appl Artif Intell 49:176–193
Abbasi E, Shiri ME, Ghatee M (2016) A regularized root–quartic mixture of experts for complex classification problems. Knowl Based Syst 110:98–109
Zhang Z, Krawczyk B, Garcìa S, Rosales-Perez A, Herrera F (2016) Empowering one-vs-one decomposition with ensemble learning for multi-class imbalanced data. Knowl Based Syst 106:251–263
Guido RC (2016) ZCR-aided neurocomputing: a study with applications. Knowl Based Syst 105:248–269
Yijing L, Haixiang G, Xiao L, Yanan L, Jinling L (2016) Adapted ensemble classification algorithm based on multiple classifier system and feature selection for classifying multi-class imbalanced data. Knowl Based Syst 94:88–104
Micheloni C, Rani A, Kumar S, Foresti GL (2012) A balanced neural tree for pattern classification. Neural Netw 27:81–90
Schwenker F, Kestler HA, Palm G (2001) Three learning phases for radial-basis-function networks. Neural Netw 14(4):439–458
Kubat M (1998) Decision trees can initialize radial-basis function networks. IEEE Trans Neural Netw 9(5):813–821
Foresti GL, Pieroni G (1998) Exploiting neural trees in range image understanding. Pattern Recogn Lett 19(9):869–878
Zhang M-L, Zhou Z-H (2006) Adapting RBF neural networks to multi-instance learning. Neural Process Lett 23(1):1–26
Foresti GL, Micheloni C (2002) Generalized neural trees for pattern classification. IEEE Trans Neural Netw 13(6):1540–1547
Foresti GL, Dolso T (2004) An adaptive high-order neural tree for pattern recognition. IEEE Trans Syst Man Cybern Part B (Cybern) 34(2):988–996
Maji P (2008) Efficient design of neural network tree using a new splitting criterion. Neurocomputing 71(4):787–800
Akbilgic O, Bozdogan H, Erdal Balaban M (2014) A novel Hybrid RBF neural networks model as a forecaster. Stat Comput 24(3):365–375
Rani A, Kumar S, Micheloni C, Foresti GL (2013) Incorporating linear discriminant analysis in neural tree for multidimensional splitting. Appl Soft Comput 13(10):4219–4228
Rani A, Foresti GL, Micheloni C (2015) A neural tree for classification using convex objective function. Pattern Recogn Lett 68:41–47
Martinel N, Micheloni C, Foresti GL (2015) The evolution of neural learning systems: a novel architecture combining the strengths of NTs, CNNs, and ELMs. IEEE Syst Man Cybern Mag 1(3):17–26
Chen Y, Yang B, Dong J, Abraham A (2005) Time-series forecasting using flexible neural tree model. Inf Sci 174(3):219–235
Gentili S (2003) A new method for information update in supervised neural structures. Neurocomputing 51:61–74
Sakar A, Mammone RJ (1993) Growing and pruning neural tree networks. IEEE Trans Comput 42(3):291–299
Dhaka VP, Sharma MK (2015) Classification of image using a genetic general neural decision tree. Int J Appl Pattern Recognit 2(1):76–95
Ebtehaj I, Bonakdari H, Zaji AH (2016) An expert system with radial basis function neural network based on decision trees for predicting sediment transport in sewers. Water Sci Technol 74(1):176–183
Sug H (2010) Generating better radial basis function network for large data set of census. Int J Softw Eng Appl 4(2):15–22
Figueredo MVM (2013) A learning algorithm for constructive neural networks inspired on decision trees and evolutionary algorithms. Ph.D. thesis, Curitiba
Alcalá-Fdez J, Fernández A, Luengo J, Derrac J, García S, Sánchez L, Herrera F (2011) Keel data-mining software tool: data set repository, integration of algorithms and experimental analysis framework. J Mult Valued Log Soft Comput 17:255–287
Bache K, Lichman M (2013) UCI machine learning repository. University of California, School of Information and Computer Science, Irvine, CA. http://archive.ics.uci.edu/ml. Accessed 19 May 2018
Ojha VK, Abraham A, Snasel V (2017) Ensemble of heterogeneous flexible neural trees using multiobjective genetic programming. Appl Soft Comput 52:909–924
Lopez V, Fernandez A, Garcia S, Palade V, Herrera F (2013) An insight into classification with imbalanced data: empirical results and current trends on using data intrinsic characteristics. Inf Sci 250:113–141
Liu H, Yu L (2005) Toward integrating feature selection algorithms for classification and clustering. IEEE Trans Knowl Data Eng 17(4):491–502
Guyon I, Elisseeff A (2003) An introduction to variable and feature selection. J Mach Learn Res 3(Mar):1157–1182
Tabakhi S, Moradi P, Akhlaghian F (2014) An unsupervised feature selection algorithm based on ant colony optimization. Eng Appl Artif Intell 32:112–123
Bolon-Canedo V, Sánchez-Marono N, Alonso-Betanzos A (2013) A review of feature selection methods on synthetic data. Knowl Inf Syst 34(3):483–519
Ma L, Destercke S, Wang Y (2016) Online active learning of decision trees with evidential data. Pattern Recogn 52:33–45
Sing JK, Basu DK, Nasipuri M, Kundu M (2004) Center selection of RBF neural network based on modified k-means algorithm with point symmetry distance measure. Found Comput Decis Sci 29(3):247–266
Yang R, Er PV, Wang Z, Tan KK (2016) An RBF neural network approach towards precision motion system with selective sensor fusion. Neurocomputing 199:31–39
Fatemi M (2016) A new efficient conjugate gradient method for unconstrained optimization. J Comput Appl Math 300:207–216
Bertsekas DP (1999) Nonlinear programming. Athena Scientific, Belmont, pp 1–60
Abbasi E, Shiri ME, Ghatee M (2016) Root-quatric mixture of experts for complex classification problems. Expert Syst Appl 53:192–203
Masoudnia S, Ebrahimpour R, Arani SAAA (2012) Incorporation of a regularization term to control negative correlation in mixture of experts. Neural Process Lett 36(1):31–47
Girosi F, Jones M, Poggio T (1995) Regularization theory and neural networks architectures. Neural Comput 7(2):219–269
Prachuabsupakij W, Soonthornphisaj N (2012) A new classification for multiclass imbalanced datasets based on clustering approach. In: The 26th annual conference of the Japanese society for artificial intelligence
Lyon RJ, Brooke JM, Knowles JD, Stappers BW (2014) Hellinger distance trees for imbalanced streams. In: 2014 22nd international conference on pattern recognition (ICPR), pp 1969–1974. IEEE
Haque MM (2014) Identification of novel differentially methylated DNA regions using active learning and imbalanced class learners. Doctoral dissertation, Washington State University
Fontenla-Romero O, Guijarro-Berdiñas B, Pérez-Sánchez B, Alonso-Betanzos A (2010) A new convex objective function for the supervised learning of single-layer neural networks. Pattern Recogn 43(5):1984–1992
Department for Transport, Road Accident Statistics Branch (2015) Road accident data, 2014, [data collection]. UK Data Service. SN: 7752. http://doi.org/10.5255/UKDA-SN-7752-1. Accessed 19 May 2018
Acknowledgements
We highly appreciate the anonymous reviewers, the Area Editor, and the Editor-in-Chief for their valuable comments, which helped us improve this paper.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Appendices
Appendix A: Evaluation measures
There are several measures for analyzing the performance of classification algorithms. This appendix describes the popular ones used in this paper, which are built on the following quantities:

- True Positive (TP): the number of samples classified as positive that are positive in reality.

- True Negative (TN): the number of samples classified as negative that are negative in reality.

- False Positive (FP): the number of samples classified as positive that are negative in reality.

- False Negative (FN): the number of samples classified as negative that are positive in reality.
- Precision: useful for comparing the performance of different algorithms; it is computed by Eq. (11), see [16].

$$\text{Precision} = \frac{\text{TP}}{\text{TP} + \text{FP}}\quad(11)$$

- Recall: computed by Eq. (12), see [16].

$$\text{Recall} = \frac{\text{TP}}{\text{TP} + \text{FN}}\quad(12)$$

- F-measure: another popular measure, which reflects performance on imbalanced data classification; it is computed by Eq. (13), see [16].

$$F\text{-measure} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}\quad(13)$$

- G-mean: computed by Eq. (14), see [16].

$$G\text{-mean} = \left(\text{Precision} \times \text{Recall}\right)^{1/2}\quad(14)$$

- AUCarea: a measure introduced in [15]. All the pairwise AUC values are plotted as radii in polar coordinates, and the area covered by the resulting polygon, normalized by the area of the regular polygon with unit radii, yields AUCarea, computed by Eq. (15):

$$\text{AUCarea} = \frac{\frac{1}{2}\sin\left(\frac{2\pi}{q}\right)\left(\sum_{i=1}^{q-1} r_{i}\, r_{i+1} + r_{q}\, r_{1}\right)}{\frac{1}{2}\sin\left(\frac{2\pi}{q}\right) q} = \frac{\sum_{i=1}^{q-1} r_{i}\, r_{i+1} + r_{q}\, r_{1}}{q}\quad(15)$$

where \(r_{i}\) is the AUC value of the \(i\)-th binary combination of classes, computed by Eq. (16):

$$r_{i} = \frac{\text{Precision} - \text{Recall} + 1}{2}\quad(16)$$
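The measures above can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper: `basic_metrics` and `auc_area` are hypothetical helper names, and the inputs are assumed to be plain confusion-matrix counts and a list of pairwise AUC values.

```python
import math

def basic_metrics(tp, fp, fn):
    """Precision, Recall, F-measure, and G-mean from confusion
    counts, following Eqs. (11)-(14)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    g_mean = math.sqrt(precision * recall)
    return precision, recall, f_measure, g_mean

def auc_area(r):
    """AUCarea of Eq. (15): the polygon area spanned by the pairwise
    AUC values r[0..q-1] as polar radii, normalized by the area of the
    regular q-gon with unit radii (the 1/2*sin(2*pi/q) factor cancels)."""
    q = len(r)
    # sum of products of consecutive radii, wrapping from r_q back to r_1
    return sum(r[i] * r[(i + 1) % q] for i in range(q)) / q
```

For example, with TP = 8, FP = 2, FN = 2, all four measures equal 0.8; and if every pairwise AUC value is 1, `auc_area` returns 1, the normalized maximum.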
Appendix B: Dataset description
| Number | Dataset | Number of classes | Number of samples | Number of features |
|---|---|---|---|---|
| 1 | Zoo | 7 | 101 | 16 |
| 2 | Yeast | 10 | 1484 | 8 |
| 3 | Aba | 29 | 4177 | 8 |
| 4 | Lym | 4 | 148 | 18 |
| 5 | Ecoli | 4 | 358 | 7 |
| 6 | Car | 4 | 1728 | 6 |
| 7 | Pen-digit | 10 | 1100 | 16 |
| 8 | Mf-mor | 6 | 2000 | 6 |
| 9 | Led | 7 | 500 | 7 |
| 10 | Wine | 3 | 178 | 13 |
| 11 | New-thyroid | 3 | 215 | 5 |
| 12 | HayesR | 3 | 160 | 5 |
| 13 | Satellitea | 7 | 6435 | 4 |
| 14 | Glass | 7 | 214 | 10 |
| 15 | Vehicle | 4 | 946 | 18 |
| 16 | Letter | 26 | 20,000 | 16 |
| 17 | Segment | 7 | 2310 | 19 |
| 18 | Iris | 3 | 150 | 4 |
| 19 | WDBC | 2 | 569 | 30 |
| 20 | Ionosphere | 2 | 350 | 34 |
| 21 | Breast | 2 | 799 | 9 |
| 22 | Heart | 2 | 270 | 13 |
| 23 | Hepatitis | 2 | 155 | 19 |
Cite this article
Abpeykar, S., Ghatee, M. An ensemble of RBF neural networks in decision tree structure with knowledge transferring to accelerate multi-classification. Neural Comput & Applic 31, 7131–7151 (2019). https://doi.org/10.1007/s00521-018-3543-9