Abstract
We exploit an evolutionary three-objective optimization algorithm to produce a Pareto front approximation composed of fuzzy rule-based classifiers (FRBCs) with different trade-offs between accuracy (expressed in terms of sensitivity and specificity) and complexity (computed as the sum of the conditions in the antecedents of the classifier rules). Then, we use the ROC convex hull method to select the potentially optimal classifiers in the projection of the Pareto front approximation onto the ROC plane. Our method was tested on 13 highly imbalanced datasets and compared with two two-objective evolutionary approaches and one heuristic approach to FRBC generation, and with three well-known classifiers. We show by the Wilcoxon signed-rank test that our three-objective optimization approach outperforms all the other techniques, except for one classifier, in terms of the area under the ROC convex hull, an accuracy measure used to globally compare different classification approaches. Further, all the FRBCs in the ROC convex hull are characterized by a low value of complexity. Finally, we discuss how, once the misclassification costs and the class distributions are fixed, we can select the most suitable classifier for the specific application. We show that, in two specific medical applications, the FRBC selected from the convex hull produced by our three-objective optimization approach achieves the lowest classification cost among the techniques used for comparison.
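The selection step summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (roc_convex_hull, area_under_hull, min_cost_classifier), the sample ROC points, and the cost values are all hypothetical, and the expected-cost formula is the standard one from ROC analysis, assumed here rather than taken from the abstract. Each classifier is represented by its ROC point (false positive rate, true positive rate), where the true positive rate is the sensitivity and the false positive rate is 1 minus the specificity.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (false positive rate, true positive rate)


def _cross(o: Point, a: Point, b: Point) -> float:
    """Cross product of OA and OB; positive means a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_convex_hull(points: List[Point]) -> List[Point]:
    """Upper convex hull of the ROC points, left to right.

    The trivial classifiers (0, 0) and (1, 1) are always included.
    """
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull: List[Point] = []
    for p in pts:
        # Remove the last hull point while it lies on or below the segment
        # joining its predecessor to p (a left turn or a collinear point).
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def area_under_hull(hull: List[Point]) -> float:
    """Area under the ROC convex hull, computed with the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(hull, hull[1:]))


def min_cost_classifier(hull: List[Point], p_pos: float,
                        cost_fn: float, cost_fp: float) -> Point:
    """Hull point with the lowest expected misclassification cost.

    Assumed cost model: p_pos * (1 - TPR) * cost_fn + (1 - p_pos) * FPR * cost_fp,
    where p_pos is the prior probability of the positive class.
    """
    p_neg = 1.0 - p_pos

    def expected_cost(pt: Point) -> float:
        fpr, tpr = pt
        return p_pos * (1.0 - tpr) * cost_fn + p_neg * fpr * cost_fp

    return min(hull, key=expected_cost)


# Hypothetical ROC points of four classifiers taken from a Pareto front projection.
candidates = [(0.1, 0.6), (0.3, 0.9), (0.2, 0.65), (0.5, 0.7)]
hull = roc_convex_hull(candidates)
auch = area_under_hull(hull)
# Imbalanced setting: 5% positives; a missed positive costs 10 or 50 times a false alarm.
best_low = min_cost_classifier(hull, p_pos=0.05, cost_fn=10.0, cost_fp=1.0)
best_high = min_cost_classifier(hull, p_pos=0.05, cost_fn=50.0, cost_fp=1.0)
```

In this toy example the classifiers at (0.2, 0.65) and (0.5, 0.7) fall below the hull and can never be optimal under the assumed cost model, whatever the costs and class distribution. Raising the cost of a false negative moves the selected operating point along the hull toward higher sensitivity.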
Cite this article
Ducange, P., Lazzerini, B. & Marcelloni, F. Multi-objective genetic fuzzy classifiers for imbalanced and cost-sensitive datasets. Soft Comput 14, 713–728 (2010). https://doi.org/10.1007/s00500-009-0460-y
DOI: https://doi.org/10.1007/s00500-009-0460-y