A novel multi-stage hybrid model with enhanced multi-population niche genetic algorithm: An application in credit scoring

https://doi.org/10.1016/j.eswa.2018.12.020

Highlights

  • A novel multi-stage hybrid model is proposed and applied to credit scoring.

  • Multi-population niche GA (MPNGA) is proposed to improve search efficiency.

  • Feature/classifier selection enables the acquisition of optimal subset.

  • The stacking-based ensemble is constructed to enhance predictive effectiveness.

  • The proposed model is validated on five datasets over four performance metrics.

Abstract

In recent years, artificial intelligence and machine learning have progressed considerably, and various novel models have been constructed to enhance the prediction performance of binary classification from different aspects. Credit scoring is a typical application of these technologies. In this study, we propose a novel multi-stage hybrid model that combines feature selection and classifier selection to obtain an optimal feature subset and an optimal classifier subset, and then uses a classifier ensemble built on these two subsets to improve prediction performance. We also extend the genetic algorithm by proposing an enhanced multi-population niche genetic algorithm (EMPNGA), which improves optimization ability by enhancing the selection, crossover, and mutation steps and by adding niche and migration steps. Furthermore, EMPNGA is combined with several filter methods in feature selection and with prior knowledge in classifier selection to further increase search efficiency. The proposed model is applied to credit scoring to verify its prediction performance, using five datasets and four evaluation metrics. The experimental results indicate that the proposed model outperforms the comparative models, supporting its effectiveness.

Introduction

In recent years, artificial intelligence and machine learning have developed considerably. Several typical classification models have been applied to binary classification in previous studies, such as linear discriminant analysis (LDA; Fisher, 1936), logistic regression (LR; Hand & Kelly, 2002), decision tree (DT; Li, Ying, Tuo, & Li, 2004), support vector machine (SVM; Huang, Chen, Hsu, Chen, & Wu, 2004), and multilayer perceptron network (MLP; West, 2000).

Datasets for machine learning are typically multidimensional. However, irrelevant and redundant features not only reduce the prediction performance of a classification model but can also increase its computational complexity. Feature selection methods are recognized as promising approaches in machine learning: they identify the key features, which reduces the computing time of classification models and improves their prediction performance. Several previous studies have explored feature selection methods (Chen & Li, 2010; Hajek & Michalak, 2013; Maldonado, Pérez, & Bravo, 2017; Oreski & Oreski, 2014; Wang, Zhang, Bai, & Mao, 2017), but new capabilities remain to be discovered and explored.

Ensemble models have also been widely considered in recent years as a way to improve the performance of classification models. Many ensemble models have been applied to machine learning, such as homogeneous ensemble models based on DT, including random forest (RF; Friedman, 2001), gradient boosting decision tree (GBDT; Friedman, 2001), and XGBoost (Chen & Guestrin, 2016). Heterogeneous ensemble models, which combine multiple base classifiers, have also garnered widespread attention (Ala'Raj & Abbod, 2016a, 2016b; Xia, Liu, Da, & Xie, 2018). Lessmann, Baesens, Seow, and Thomas (2015) showed that heterogeneous ensembles frequently outperform individual classifiers. However, how to determine the most effective ensemble model for a given dataset has not yet been completely solved. In addition, the problem complexity and computational time of classifier selection in the original feature space are usually large. Therefore, effective classifier selection methods should be considered to obtain a more appropriate ensemble model within a certain complexity.

Credit scoring has gained considerable attention in the financial industry owing to its importance in credit risk management. A small improvement in a credit scoring model can bring large profits to financial institutions; therefore, many artificial intelligence and machine learning models have been applied to credit scoring to verify their performance in binary classification. In this study, we propose a novel multi-stage hybrid model that combines feature selection and classifier selection to obtain superior prediction performance. Furthermore, an enhanced multi-population niche genetic algorithm (EMPNGA) is proposed and combined with several filter methods in feature selection and with prior knowledge in classifier selection, to enable the acquisition of the optimal feature and classifier subsets. A classifier ensemble built on these optimal subsets is then used to improve the prediction performance of the model. The proposed model is applied to credit scoring to verify its prediction performance in binary classification. The experimental results demonstrate that each stage of the hybrid model plays a significant role in improving prediction performance and that the final prediction performance of the proposed model is superior to that of the comparative models. This indicates that the proposed model is effective and practical, and it provides a new direction for future machine learning research.
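
To make the multi-population idea concrete, the following is a minimal Python sketch and not the authors' EMPNGA. It assumes that individuals are binary masks (1 = feature kept), that the niche step is fitness sharing by Hamming distance, and that migration copies each subpopulation's best individual into the next subpopulation in a ring. All function names, parameter values, and the toy fitness function are hypothetical.

```python
import random

def hamming(a, b):
    """Number of positions at which two bit masks differ."""
    return sum(x != y for x, y in zip(a, b))

def shared_fitness(pop, fitness, radius):
    """Niche step (assumed form): divide raw fitness by the number of
    individuals closer than `radius`, so crowded regions are penalised.
    Raw fitness is assumed to be non-negative."""
    shared = []
    for ind in pop:
        niche_count = sum(1 for other in pop if hamming(ind, other) < radius)
        shared.append(fitness(ind) / niche_count)  # niche_count >= 1 (self)
    return shared

def evolve(pop, fitness, rng, radius=2, p_mut=0.05):
    """One generation: tournament selection on shared fitness,
    one-point crossover, bit-flip mutation, and one elite copy."""
    scores = shared_fitness(pop, fitness, radius)

    def pick():
        a, b = rng.sample(range(len(pop)), 2)
        return pop[a] if scores[a] >= scores[b] else pop[b]

    children = [max(pop, key=fitness)[:]]  # elitism on raw fitness
    while len(children) < len(pop):
        p1, p2 = pick(), pick()
        cut = rng.randrange(1, len(p1))
        child = p1[:cut] + p2[cut:]
        children.append([b ^ 1 if rng.random() < p_mut else b for b in child])
    return children

def migrate(pops, fitness):
    """Ring migration: the best of population i-1 replaces the worst of i."""
    bests = [max(p, key=fitness)[:] for p in pops]
    for i, p in enumerate(pops):
        worst = min(range(len(p)), key=lambda j: fitness(p[j]))
        p[worst] = bests[i - 1]

def multi_population_ga(fitness, n_bits, n_pops=3, pop_size=12,
                        generations=30, migrate_every=5, seed=0):
    rng = random.Random(seed)
    pops = [[[rng.randint(0, 1) for _ in range(n_bits)]
             for _ in range(pop_size)] for _ in range(n_pops)]
    for g in range(1, generations + 1):
        pops = [evolve(p, fitness, rng) for p in pops]
        if g % migrate_every == 0:
            migrate(pops, fitness)
    return max((ind for p in pops for ind in p), key=fitness)

# Toy fitness (hypothetical): number of bits agreeing with a made-up mask.
TARGET = [1, 0, 1, 1, 0, 0, 1, 0]

def toy_fitness(mask):
    return sum(m == t for m, t in zip(mask, TARGET))

best = multi_population_ga(toy_fitness, n_bits=len(TARGET))
print(best, toy_fitness(best))
```

In a real feature-selection setting, the toy fitness would be replaced by a validation score of a classifier trained on the selected features, which is far more expensive to evaluate; that cost is one reason search efficiency matters.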

The remainder of this paper is organized as follows. Section 2 reviews related work on genetic algorithms, feature selection, and classifier ensembles. Section 3 describes the mechanism of the proposed model. Section 4 presents the experimental design. Section 5 reports the experimental results and comparative analysis. Section 6 presents the conclusions and future work.

Related work

The work in this paper relates to three areas: (1) genetic algorithms, (2) feature selection, and (3) classifier ensembles. As important sub-fields of machine learning research, these topics have attracted much attention from scholars. This section reviews the three topics and elaborates on their applications in credit scoring.

The proposed multi-stage hybrid model

In this section, the multi-stage hybrid model is presented, and its framework is described in Fig. 1. This hybrid model can be divided into three stages: feature selection, classifier selection, and classifier ensemble. In the feature selection stage, the preprocessed data are used as input data and several filter methods are combined to determine the synthetic feature importance of all the features. The synthetic feature importance combines the respective characteristics of the several filter
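
As an illustration of combining several filter methods into one synthetic feature importance, the sketch below min-max normalises the scores of each filter and averages them per feature. The two filters (absolute Pearson correlation and a scaled class-mean gap), the averaging rule, and all names are assumptions made for illustration; the combination rule actually used in the paper is not given in this excerpt.

```python
from statistics import mean, pstdev

def abs_pearson(xs, ys):
    """Filter 1 (assumed): absolute Pearson correlation with the label."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return abs(cov / (sx * sy)) if sx and sy else 0.0

def mean_gap(xs, ys):
    """Filter 2 (assumed): gap between class means, scaled by the overall
    standard deviation. Labels must be 0/1 with both classes present."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    sd = pstdev(xs)
    return abs(mean(pos) - mean(neg)) / sd if sd else 0.0

def min_max(scores):
    """Rescale scores to [0, 1] so different filters become comparable."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) if hi > lo else 0.0 for s in scores]

def synthetic_importance(columns, labels, filters=(abs_pearson, mean_gap)):
    """Average the normalised scores of all filters for each feature."""
    per_filter = [min_max([f(col, labels) for col in columns]) for f in filters]
    return [mean(vals) for vals in zip(*per_filter)]

# Made-up data: three feature columns and binary labels.
columns = [[1, 2, 3, 4], [1, 1, 1, 1], [1, 3, 2, 4]]
labels = [0, 0, 1, 1]
importance = synthetic_importance(columns, labels)
ranking = sorted(range(len(columns)), key=lambda i: -importance[i])
print(importance, ranking)
```

A ranking of this kind could, for example, be used to bias the initial population of a search over feature subsets toward features that several filters agree on.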

Credit datasets

In the experiment, five real-world credit datasets are used to verify the performance of the proposed model. They comprise three credit scoring datasets from the UCI Machine Learning Repository (Asuncion & Newman, 2007), namely the Australian, German, and Japanese datasets; the PPDai dataset, which is part of a loan dataset provided by the Chinese internet finance enterprise PaiPaiDai; and the GMSC dataset, which is published by a famous data competition platform (Kaggle

Experimental results

In this section, experimental results are presented to validate the advantages of the proposed model over the comparative classifiers and to demonstrate its effectiveness. All experiments were run with Python 3.6 on a PC with a 3.2 GHz Intel Core i7 processor, 32 GB of RAM, and the Microsoft Windows 7 operating system.

Conclusions and future work

In recent years, artificial intelligence and machine learning have developed rapidly, and various novel models have been constructed to enhance prediction performance in binary classification. Researchers have conducted numerous valuable explorations in several fields, including feature selection, classifier selection, and classifier ensembles. Although some studies have combined the above-mentioned approaches, the optimal integration of them has not been

Acknowledgment

This work was supported by the National Natural Science Foundation of China (Nos. 51875503, 51475410) and the Zhejiang Natural Science Foundation of China (No. LY17E050010).

References (40)

  • M. Ala'Raj et al.

    Classifiers consensus system approach for credit scoring

    Knowledge-Based Systems

    (2016)
  • M. Ala'Raj et al.

    A new hybrid ensemble credit scoring model based on classifiers consensus system approach

    Expert Systems with Applications

    (2016)
  • A. Asuncion et al.

    UCI machine learning repository

    (2007)
  • A. Bequé et al.

    Approaches for credit scorecard calibration: An empirical analysis

    Knowledge-Based Systems

    (2017)
  • L. Breiman

    Bagging predictors

    Machine Learning

    (1996)
  • F.L. Chen et al.

    Combination of feature selection approaches with SVM in credit scoring

    Expert Systems with Applications

    (2010)
  • N. Chen et al.

    A genetic algorithm-based approach to cost-sensitive bankruptcy prediction

    Expert Systems with Applications

    (2011)
  • T. Chen et al.

    XGBoost: A scalable tree boosting system

  • C.H. Chou et al.

    Hybrid genetic algorithm and fuzzy clustering for bankruptcy prediction

    Applied Soft Computing

    (2017)
  • T.M. Cover et al.

    Elements of information theory

    (1991)
  • J. Demšar

    Statistical comparisons of classifiers over multiple data sets

    The Journal of Machine Learning Research

    (2006)
  • S. Finlay

    Multiple classifier architectures and their application to credit risk assessment

    European Journal of Operational Research

    (2011)
  • R.A. Fisher

    Studies in crop variation. I. An examination of the yield of dressed grain from Broadbalk

    The Journal of Agricultural Science

    (1921)
  • R.A. Fisher

    The use of multiple measurements in taxonomic problems

    Annals of Human Genetics

    (1936)
  • J.H. Friedman

    Greedy function approximation: A gradient boosting machine

    Annals of Statistics

    (2001)
  • M. Friedman

    A comparison of alternative tests of significance for the problem of m rankings

    The Annals of Mathematical Statistics

    (1940)
  • P. Hajek et al.

    Feature selection in corporate credit rating prediction

    Knowledge-Based Systems

    (2013)
  • D.J. Hand

    Measuring classifier performance: A coherent alternative to the area under the ROC curve

    Machine Learning

    (2009)
  • D.J. Hand et al.

    A better beta for the H measure of classification performance

    Pattern Recognition Letters

    (2014)
  • D.J. Hand et al.

    Superscorecards

    IMA Journal of Management Mathematics

    (2002)