Abstract
In this paper, we propose an adjustable fuzzy classification algorithm using a multi-objective genetic strategy based on decomposition (AFC_MOGD) to solve the imbalanced classification problem. First, AFC_MOGD adopts an improved multi-objective genetic strategy based on decomposition as its basic optimization algorithm, in which a new updating pattern is designed to obtain good solutions. Second, an adjustable parameter in the interval [0, 1] is used to manually adjust the complexity of each classifier. Finally, a normalization method that takes class percentages into account when determining the class label and rule weight of each rule is introduced to obtain more reasonable rules. The proposed algorithm is compared with three typical algorithms on eleven imbalanced datasets in terms of the area under the ROC convex hull. The Wilcoxon signed-rank test is also carried out to show that our algorithm is superior to the other algorithms.
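The comparison measure named in the abstract, the area under the ROC convex hull, can be illustrated with a short sketch. This is a generic illustration of that measure and not the authors' implementation: the function names `roc_convex_hull` and `auch` are hypothetical, and the (false positive rate, true positive rate) points are made-up values standing in for the classifiers on a Pareto front.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (false positive rate, true positive rate)


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); negative means a clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_convex_hull(points: List[Point]) -> List[Point]:
    """Upper convex hull of ROC points, with (0, 0) and (1, 1) always included."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull: List[Point] = []
    for p in pts:
        # Drop the last hull point while it lies on or below the line
        # from the point before it to the new point p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def auch(points: List[Point]) -> float:
    """Area under the ROC convex hull, computed with the trapezoid rule."""
    hull = roc_convex_hull(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(hull, hull[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


if __name__ == "__main__":
    # Hypothetical operating points of three classifiers (illustrative only).
    classifiers = [(0.2, 0.7), (0.5, 0.6), (0.5, 0.2)]
    print(roc_convex_hull(classifiers))
    print(auch(classifiers))
```

Only classifiers on the hull contribute to the area, so a dominated operating point such as (0.5, 0.6) in the example does not change the score. For a single classifier the value reduces to (1 + TPR − FPR) / 2.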
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Nos. 61876141, 61373111, 61272279, 61103119, 61672405 and 61203303); the Fundamental Research Funds for the Central Universities (No. JB1502027); and the Provincial Natural Science Foundation of Shaanxi of China (No. 2014JM8321).
Cite this article
Liu, R., Wang, F., He, M. et al. An adjustable fuzzy classification algorithm using an improved multi-objective genetic strategy based on decomposition for imbalance dataset. Knowl Inf Syst 61, 1583–1605 (2019). https://doi.org/10.1007/s10115-019-01342-5
DOI: https://doi.org/10.1007/s10115-019-01342-5