Multi-objective genetic fuzzy classifiers for imbalanced and cost-sensitive datasets

  • Original Paper
Soft Computing

Abstract

We exploit an evolutionary three-objective optimization algorithm to produce a Pareto front approximation composed of fuzzy rule-based classifiers (FRBCs) with different trade-offs between accuracy (expressed in terms of sensitivity and specificity) and complexity (computed as the sum of the conditions in the antecedents of the classifier rules). Then, we use the ROC convex hull method to select the potentially optimal classifiers in the projection of the Pareto front approximation onto the ROC plane. Our method was tested on 13 highly imbalanced datasets and compared with two two-objective evolutionary approaches and one heuristic approach to FRBC generation, and with three well-known classifiers. We show by the Wilcoxon signed-rank test that our three-objective optimization approach outperforms all the other techniques, except for one classifier, in terms of the area under the ROC convex hull, an accuracy measure used to globally compare different classification approaches. Further, all the FRBCs in the ROC convex hull are characterized by low complexity. Finally, we discuss how, once the misclassification costs and the class distributions are fixed, we can select the most suitable classifier for the specific application. We show that, in two specific medical applications, the FRBC selected from the convex hull produced by our three-objective optimization approach achieves the lowest classification cost among the techniques used for comparison.
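The selection steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the example ROC points, and the cost values are all hypothetical, and the expected-cost formula is the standard one from ROC analysis, assumed here rather than taken from the paper. Each classifier is represented by a point (false positive rate, true positive rate), that is, (1 - specificity, sensitivity).

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (false positive rate, true positive rate)


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_convex_hull(points: List[Point]) -> List[Point]:
    """Upper convex hull of the ROC points, sorted by increasing FPR.

    The trivial classifiers (0, 0) and (1, 1) are always included.
    """
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull: List[Point] = []
    for p in pts:
        # Drop the last hull point while it lies on or below the
        # segment joining its predecessor to the new point.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def area_under_hull(hull: List[Point]) -> float:
    """Area under the ROC convex hull, by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(hull, hull[1:]))


def expected_cost(p: Point, p_pos: float, c_fn: float, c_fp: float) -> float:
    """Expected misclassification cost per example (assumed formula).

    p_pos is the prior of the positive class, c_fn the cost of a false
    negative, c_fp the cost of a false positive.
    """
    fpr, tpr = p
    return p_pos * (1.0 - tpr) * c_fn + (1.0 - p_pos) * fpr * c_fp


def select_min_cost(hull: List[Point], p_pos: float,
                    c_fn: float, c_fp: float) -> Point:
    """Hull point with the lowest expected cost for the given conditions."""
    return min(hull, key=lambda p: expected_cost(p, p_pos, c_fn, c_fp))


# Hypothetical ROC points, e.g. a Pareto front projected onto the ROC plane.
points = [(0.1, 0.6), (0.3, 0.7), (0.2, 0.85), (0.5, 0.95), (0.4, 0.6)]
hull = roc_convex_hull(points)
area = area_under_hull(hull)
# Imbalanced data (10% positives) where a false negative costs 10x more.
best = select_min_cost(hull, p_pos=0.1, c_fn=10.0, c_fp=1.0)
```

In this toy example the points (0.3, 0.7) and (0.4, 0.6) fall below the hull and can never be optimal under any cost and class distribution. With equal costs and 10% positives, the same selection would return the trivial always-negative classifier (0, 0), which illustrates why fixing the costs matters on imbalanced data.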

Author information

Corresponding author

Correspondence to Francesco Marcelloni.

About this article

Cite this article

Ducange, P., Lazzerini, B. & Marcelloni, F. Multi-objective genetic fuzzy classifiers for imbalanced and cost-sensitive datasets. Soft Comput 14, 713–728 (2010). https://doi.org/10.1007/s00500-009-0460-y

  • DOI: https://doi.org/10.1007/s00500-009-0460-y
