An adjustable fuzzy classification algorithm using an improved multi-objective genetic strategy based on decomposition for imbalance dataset

  • Regular Paper
Knowledge and Information Systems

Abstract

In this paper, we propose an adjustable fuzzy classification algorithm using a multi-objective genetic strategy based on decomposition (AFC_MOGD) to solve the imbalanced classification problem. AFC_MOGD has three components. First, an improved multi-objective genetic strategy based on decomposition serves as the underlying optimization algorithm, with a new updating pattern designed to obtain good solutions. Second, an adjustable parameter in the interval [0, 1] lets the user set the complexity of each classifier manually. Third, a normalization method that takes class percentages into account is used to determine the class label and rule weight of each rule, so as to obtain more reasonable rules. The proposed algorithm is compared with three typical algorithms on eleven imbalanced datasets in terms of the area under the ROC convex hull. The Wilcoxon signed-rank test is also carried out to show that our algorithm is superior to the other algorithms.
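
As a rough illustration of the evaluation metric named in the abstract, the sketch below computes the area under the ROC convex hull for a set of classifiers, each summarized by one (false positive rate, true positive rate) point. The function names `roc_convex_hull` and `auch` and the sample points are hypothetical and chosen only for illustration; the paper's own evaluation code is not part of this preview and may differ in detail.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (false positive rate, true positive rate)


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive means a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_convex_hull(points: List[Point]) -> List[Point]:
    """Upper convex hull of the ROC points, always including (0, 0) and (1, 1)."""
    # The trivial classifiers "always negative" and "always positive" anchor the hull.
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull: List[Point] = []
    for p in pts:
        # Pop the last hull point while it lies on or below the segment hull[-2] -> p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def auch(points: List[Point]) -> float:
    """Area under the ROC convex hull, computed with the trapezoidal rule."""
    hull = roc_convex_hull(points)
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(hull, hull[1:]))


if __name__ == "__main__":
    # Hypothetical (FPR, TPR) points for three classifiers; the third is dominated.
    classifiers = [(0.1, 0.6), (0.3, 0.8), (0.5, 0.7)]
    print(roc_convex_hull(classifiers))
    print(round(auch(classifiers), 4))
```

A dominated classifier, such as the point (0.5, 0.7) in the sample data, falls below the hull and does not contribute to the area, which is why the hull-based measure suits comparing sets of non-dominated classifiers produced by a multi-objective search.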



Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 61876141, 61373111, 61272279, 61103119, 61672405 and 61203303); the Fundamental Research Funds for the Central Universities (No. JB1502027); and the Provincial Natural Science Foundation of Shaanxi of China (No. 2014JM8321).

Author information

Corresponding author

Correspondence to Ruochen Liu.


About this article

Cite this article

Liu, R., Wang, F., He, M. et al. An adjustable fuzzy classification algorithm using an improved multi-objective genetic strategy based on decomposition for imbalance dataset. Knowl Inf Syst 61, 1583–1605 (2019). https://doi.org/10.1007/s10115-019-01342-5

  • DOI: https://doi.org/10.1007/s10115-019-01342-5
