An uncertainty-oriented cost-sensitive credit scoring framework with multi-objective feature selection

https://doi.org/10.1016/j.elerap.2022.101155

Highlights

  • An uncertainty-oriented credit scoring framework with multi-objective feature selection is developed to tackle the credit classification task under uncertainty.

  • This study extends the use of cost space to feature selection in the credit scoring process.

  • The proposed credit scoring framework can provide a series of Pareto-optimal credit scoring models to fit different decision-making contexts.

  • The cost plot enables credit decision-makers to see the results of different credit scoring models with different risk sources and corresponding expected costs.

Abstract

To address the problem of uncertain misclassification costs and class distributions in credit scoring tasks, this study proposes an uncertainty-oriented credit scoring framework based on a multi-objective feature selection strategy. The framework searches for a pool of Pareto-optimal credit scoring models with different feature subsets without assuming any information about the operating condition (misclassification costs and class distributions). Specifically, the search addresses the trade-off between the False Positive Rate and the False Negative Rate using a binary multi-objective particle swarm optimization (BMOPSO) algorithm. By visualizing the Pareto-optimal solutions in cost space, credit decision-makers can select an optimal compromise model based on their decision-making contexts. The proposed framework is compared with baseline models on three retail credit scoring datasets. The experimental results show that the proposed framework can identify the credit scoring model with minimal misclassification cost for almost all possible operating conditions.

Introduction

As a critical component of the credit risk management process, credit scoring helps financial institutions decide whether to grant credit (Thomas, 2000). A credit scoring task is commonly formulated as a binary classification problem (Lessmann et al., 2015). Usually, building a credit scoring model involves the following steps: (i) collecting a training dataset with class labels (e.g., 'default' and 'non-default' classes), (ii) constructing a model that can accurately identify the class of each loan application, and (iii) evaluating the identification quality of the model. Many classification algorithms, such as Logistic Regression (LR), Support Vector Machine (SVM), Artificial Neural Networks (ANN), and Random Forest (RF), have been applied in credit scoring tasks (Maldonado et al., 2017a, Wang et al., 2012). In practice, an original high-dimensional feature set may contain redundant or irrelevant attributes, leading to high computational complexity and inferior classification performance (Unler and Murat, 2010). To address this problem, more sophisticated classification frameworks have been proposed, including hybrid approaches that combine a feature selection strategy with a binary classifier (Kozodoi et al., 2019, Papouskova and Hajek, 2019). In addition, adequate feature selection strategies may also help reveal crucial insights for decision-making and reduce variable acquisition costs (Lopez and Maldonado, 2019).

Feature selection strategies can be roughly divided into three groups: filter-based, wrapper-based, and embedded methods (Liang et al., 2015, Li et al., 2017). The filter-based strategy assesses feature importance relying solely on data characteristics. Compared to other feature selection methods, the filter-based method is computationally efficient. However, because no specific classifier guides the feature selection phase, the selected features may not be optimal for a target classifier. Previous studies show that the filter-based strategy performs poorly relative to benchmark models (Sun et al., 2014). The wrapper-based method evaluates the quality of selected features according to the predictive performance of a predefined learning algorithm. Since evaluating all possible feature combinations is computationally expensive (for example, the number of possible feature combinations for d features is 2^d), researchers have suggested various heuristic search strategies, such as sequential forward selection (SFS) (Xia et al., 2017a) and evolutionary algorithms (Unler and Murat, 2010, Zhang et al., 2019). The embedded method is a trade-off between filter-based and wrapper-based strategies. Its main drawback is that it can only be used within a specific model family.
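To make the wrapper idea concrete, the following is a minimal sketch of sequential forward selection. It is an illustration only, not the procedure of any cited study: the names `sequential_forward_selection` and `toy_score` are hypothetical, and `toy_score` is an assumed stand-in for the cross-validated performance of a real classifier.

```python
from typing import Callable, FrozenSet, List


def sequential_forward_selection(
    n_features: int,
    score: Callable[[FrozenSet[int]], float],
) -> List[int]:
    """Greedy wrapper search: repeatedly add the feature that most improves `score`.

    Evaluates at most n_features * (n_features + 1) / 2 subsets
    instead of all 2 ** n_features combinations.
    """
    selected: List[int] = []
    best_score = float("-inf")
    remaining = set(range(n_features))
    while remaining:
        # Score every one-feature extension of the current subset.
        candidates = [(score(frozenset(selected + [f])), f) for f in sorted(remaining)]
        cand_score, cand_feature = max(candidates, key=lambda t: t[0])
        if cand_score <= best_score:
            break  # no extension improves the score, so stop
        selected.append(cand_feature)
        remaining.remove(cand_feature)
        best_score = cand_score
    return selected


def toy_score(subset: FrozenSet[int]) -> float:
    """Hypothetical stand-in for classifier performance (illustration only)."""
    useful = {0: 0.30, 2: 0.25, 5: 0.10}  # assumed per-feature gains
    return sum(useful.get(f, 0.0) for f in subset) - 0.02 * len(subset)


d = 8
chosen = sequential_forward_selection(d, toy_score)
n_subsets = 2 ** d  # size of the exhaustive search space
print(chosen, n_subsets)
```

With these assumed gains the greedy search keeps features 0, 2, and 5 after scoring a few dozen subsets, whereas an exhaustive wrapper would have to score all 256.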

In credit scoring studies, the wrapper-based feature selection strategy is widely employed because of its superior predictive performance and application flexibility (Wang et al., 2018, Kozodoi et al., 2019). This paper concentrates on credit scoring approaches with a wrapper-based feature selection strategy. Table 1 presents an overview of applications of feature selection strategies in credit scoring tasks.

Some previous credit scoring studies use statistical criteria to evaluate a model's performance and identify the optimal subset of features. For example, Xia et al. (2017a) use SFS for feature selection and select the optimal feature subset according to the logarithmic loss value. The feature selection processes of Oreski and Oreski (2014) and Jadhav et al. (2018) select the feature subset with the highest classification accuracy as the optimal one. Pławiak et al. (2020) use a genetic algorithm (GA) for feature selection, with the total number of incorrect classifications as the fitness function. However, these evaluation metrics may be inappropriate for credit scoring tasks because they do not properly account for the class imbalance of credit scoring data or the business reality of a credit scoring model, namely asymmetric misclassification costs (Liang et al., 2019, Verbraken et al., 2014). In practice, non-defaulted loans usually far outnumber defaulted loans in credit scoring data sets (Moscato et al., 2021). Credit scoring models whose feature selection relies on these measures tend to identify the majority class (the non-defaulted) well while identifying the minority class (the defaulted) poorly (Kang et al., 2021). Moreover, the cost of misclassifying a real defaulter (the minority) may be higher than that of misclassifying a real non-defaulter (the majority) (Yu et al., 2018). Therefore, poor classification accuracy on the minority class may significantly reduce the economic value of traditional classifiers in credit scoring tasks (Lee and Zhu, 2011, Xia et al., 2017b).
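A small worked example may help show why accuracy can mislead on imbalanced data. The sketch below uses a synthetic portfolio with made-up counts and, as an assumption of this illustration, treats the defaulter class as the "positive" class when computing the False Positive Rate and False Negative Rate; the helper `error_rates` is hypothetical.

```python
from typing import Sequence, Tuple

DEFAULT, NON_DEFAULT = "default", "non-default"


def error_rates(y_true: Sequence[str], y_pred: Sequence[str]) -> Tuple[float, float, float]:
    """Return (accuracy, FPR, FNR), treating DEFAULT as the positive class (assumption)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == DEFAULT and p == DEFAULT for t, p in pairs)
    fn = sum(t == DEFAULT and p == NON_DEFAULT for t, p in pairs)
    fp = sum(t == NON_DEFAULT and p == DEFAULT for t, p in pairs)
    tn = sum(t == NON_DEFAULT and p == NON_DEFAULT for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    fpr = fp / (fp + tn)  # share of good loans wrongly rejected
    fnr = fn / (fn + tp)  # share of defaulters wrongly accepted
    return accuracy, fpr, fnr


# Synthetic portfolio (made-up numbers): 950 good loans, 50 defaults.
y_true = [NON_DEFAULT] * 950 + [DEFAULT] * 50
# A trivial model that accepts every application.
y_pred = [NON_DEFAULT] * 1000

accuracy, fpr, fnr = error_rates(y_true, y_pred)
print(f"accuracy={accuracy:.2f}, FPR={fpr:.2f}, FNR={fnr:.2f}")
```

The trivial model reaches 95% accuracy while missing every defaulter, which is the failure mode described above.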

To counter these adverse effects, some business-oriented measures, also known as cost-sensitive measures, have been introduced. Papouskova and Hajek (2019) use the overall misclassification cost of the credit scoring model and the number of features as metrics to train and evaluate the model and to determine the optimal feature subset. Taking the expected profits and losses of credit granting into account, Verbraken et al. (2014) tailored a profit-based performance measure, the Expected Maximum Profit (EMP) measure, to credit scoring tasks. Their empirical results show that the EMP measure can yield a high-profit credit scoring model. Kou et al. (2021), Kozodoi et al. (2019), and Maldonado et al. (2017a) extend the use of EMP to feature selection in their credit scoring modeling processes. However, these cost-sensitive credit scoring models still have disadvantages. They require (i) misclassification costs (or benefits) to be known precisely, and (ii) misclassification costs (or benefits) and class distributions to remain constant while the credit scoring model is trained and evaluated. Neither requirement is realistic: the misclassification costs and class distributions in credit scoring tasks are unknown and change over time (Shen et al., 2021). Firstly, defaulted loans can eventually either be partly recovered or not recovered at all (Loterman et al., 2012), and these outcomes are highly uncertain. Secondly, misclassification cost depends on evolving variables such as interest rate volatility; in particular, the cost of misclassifying a non-defaulted loan depends heavily on the interest rate. Thirdly, the amount of doubtful and non-performing loans evolves over time because it depends strongly on macroeconomic changes (Basha et al., 2021).
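The dependence on the operating condition can be made explicit with a generic expected-cost expression. The notation below is assumed for illustration and is not taken from the source: \pi_{+} is the prior probability of the class treated as positive, c_{FN} and c_{FP} are the costs of a false negative and a false positive, and FNR and FPR are the corresponding error rates of a given model.

```latex
\mathbb{E}[C] \;=\; \pi_{+}\, c_{\mathrm{FN}}\, \mathrm{FNR} \;+\; (1-\pi_{+})\, c_{\mathrm{FP}}\, \mathrm{FPR}
```

Under this formulation, ranking models by expected cost requires fixed values for \pi_{+}, c_{FN}, and c_{FP}, which corresponds to requirements (i) and (ii): if any of the three shifts after training, the model that minimized the expression may no longer be the cheapest one.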

To handle asymmetric misclassification costs and class distributions, these cost-sensitive evaluation metrics assign a larger weight to strengthen the impact of the minority class and a smaller weight to weaken the effect of the majority class. More specifically, they reduce the rate of misclassifying a defaulter as a non-defaulter at the expense of a higher rate of misclassifying a non-defaulter as a defaulter (Maldonado et al., 2017a). An effective credit scoring model needs not only to minimize the loss from accepting bad applications but also to safeguard the benefits of accepting good applications (Rao et al., 2020). The model's ability to discriminate real default instances is crucial, but it is not reasonable to sacrifice much of its ability to identify majority class instances. Such a sacrifice could be very hazardous, especially when the costs and class probabilities cannot be specified precisely (Lee and Zhu, 2011, Zhang and Zhang, 2017). On the one hand, the financial loss avoided by identifying genuine defaulters may not offset the economic loss caused by misclassifying real non-defaulters. The experimental results of Oskarsdóttir et al. (2019) demonstrate that a conservative credit scoring model that excludes a higher proportion of defaulters may produce a low profit. On the other hand, more rejection decisions may make disadvantaged groups less likely to obtain credit. As a result, financial inclusion may be hampered (Fu et al., 2021). Therefore, the uncertain environment should be taken into account in the feature selection process of credit scoring tasks.

Considering the uncertain environment in binary classification tasks, Chatelain et al. (2010) and De Bock et al. (2020) employ multiple objectives as evaluation measures to optimize the combination of base classifiers in their ensemble classifiers. Inspired by their studies, this study constructs an uncertainty-oriented credit scoring framework through a multi-objective feature selection strategy. Specifically, the proposed framework searches for a pool of Pareto-optimal credit scoring models with respect to the trade-off between the False Positive Rate (FPR) and the False Negative Rate (FNR). Because the search makes no assumption about the operating condition (misclassification costs and class distributions), the performance of the credit scoring model is decoupled from these uncertain factors. After the pool of Pareto-optimal credit scoring models is obtained, the competence range of each model is visualized in a cost space plot. Decision-makers can then select a single credit scoring model and make customized optimal lending decisions based on the cost space plot and the information they possess at the deployment stage.
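The notion of a Pareto-optimal pool over (FPR, FNR) can be sketched in a few lines. This is a generic non-dominated filter, not the BMOPSO search used in the study; the function names and the candidate error rates are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (FPR, FNR), both to be minimised


def dominates(a: Point, b: Point) -> bool:
    """True if `a` is no worse than `b` on both rates and differs from it."""
    return a[0] <= b[0] and a[1] <= b[1] and a != b


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (FPR, FNR) pairs for five candidate models (made-up numbers).
candidates = [(0.10, 0.40), (0.20, 0.25), (0.35, 0.10), (0.30, 0.30), (0.20, 0.45)]
front = pareto_front(candidates)
print(front)
```

Here the last two candidates are dropped because another model is at least as good on both error rates; the three survivors each represent a different trade-off, and none can be ruled out without knowing the operating condition.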

The main contributions of this study can be described as follows. The proposed credit scoring framework performs feature selection using the FNR and FPR measures as fitness functions to address the uncertainty problem. To the best of our knowledge, at least two studies formulate feature selection for credit scoring as a multi-objective optimization problem (Kozodoi et al., 2019, Papouskova and Hajek, 2019). Their optimization objectives include the number of selected features and traditional cost-sensitive measures (EMP and overall misclassification cost, respectively). The problem of uncertain and evolving misclassification costs and class distributions is not considered in those studies. The proposed framework may benefit practitioners in two ways. On the one hand, its output is a set of Pareto-optimal credit scoring candidates, so a practitioner facing uncertainty can switch credit scoring models without a new, computationally expensive learning stage. On the other hand, through the multi-objective perspective and the visual interpretability of the cost plot, the practitioner can see the consequences of different credit scoring models with respect to various risk sources, together with the corresponding expected costs.
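The practical benefit of keeping a pool can be illustrated with a short sketch of model selection at deployment time. It assumes the cost-curve formulation commonly used for cost space, in which the class prior and both costs are collapsed into one probability-cost value and each model's normalized expected cost is a straight line between its FPR and its FNR. The pool, the operating conditions, and all function names are hypothetical, and "positive" is taken to mean the defaulter class.

```python
from typing import Dict, Tuple


def probability_cost(p_pos: float, c_fn: float, c_fp: float) -> float:
    """Collapse class prior and both costs into one number in [0, 1]."""
    return p_pos * c_fn / (p_pos * c_fn + (1.0 - p_pos) * c_fp)


def normalized_expected_cost(fpr: float, fnr: float, pc: float) -> float:
    """Each model is a straight line in cost space: FPR at pc=0, FNR at pc=1."""
    return fnr * pc + fpr * (1.0 - pc)


def pick_model(pool: Dict[str, Tuple[float, float]], pc: float) -> str:
    """Return the name of the pooled model with the lowest cost at `pc`."""
    return min(pool, key=lambda name: normalized_expected_cost(*pool[name], pc))


# Hypothetical Pareto-optimal pool: name -> (FPR, FNR). Numbers are made up.
pool = {"A": (0.10, 0.40), "B": (0.20, 0.25), "C": (0.35, 0.10)}

# Two assumed operating conditions (prior of the positive class and costs).
pc_mild = probability_cost(p_pos=0.10, c_fn=2.0, c_fp=1.0)
pc_severe = probability_cost(p_pos=0.20, c_fn=10.0, c_fp=1.0)

choice_mild = pick_model(pool, pc_mild)
choice_severe = pick_model(pool, pc_severe)
print(choice_mild, choice_severe)
```

Switching from the mild to the severe condition changes the preferred model from A to C using only the stored error rates, so no retraining is involved in this sketch.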

The remainder of this paper is organized as follows. Section 2 presents preliminary methods and proposes an uncertainty-oriented credit scoring framework. Section 3 describes the experimental design. Section 4 presents the experimental results and analysis on three retail credit scoring datasets. The final section draws conclusions, outlines research limitations, and offers suggestions for future research.

Section snippets

Cost space

Like most previous credit scoring studies, this study assigns the labels '0' and '1' to defaulters and non-defaulters, respectively. First, the classification algorithm might provide an estimate p̂(1|x) of the probability that an application with the feature vector x belongs to class 1, or, more generally, it might produce a continuous score, s(x). Then, the actual classification results are obtained by comparing s(x) with a cutoff value τ. In this study, all applications with s(x

Dataset description

In this study, three retail credit scoring data sets are used to evaluate the effectiveness of the proposed framework. The first data set is the lending data for the whole year of 2019 from Lending Club, the largest U.S. P2P platform. The raw Lending Club (LC) data set contains 518,107 consumer loan records with 150 attributes, among which "loan_status" is the target variable of this study. There are seven types of "loan_status" in this data set, among which "Fully Paid" and "Charged Off"

Experimental results and discussions

The proposed credit scoring framework is comprehensively compared with the benchmark models, including the resampling-based classifiers and credit scoring models with traditional feature selection strategies.

Conclusion and future works

In this research, an uncertainty-oriented credit scoring framework with multi-objective feature selection is developed to tackle the credit classification task under uncertainty. The framework involves four main phases: data preprocessing, classifier training, feature selection, and credit decision making. The framework selects False Positive Rate and False Negative Rate as optimization objectives for the wrapper-based feature selection process without the assumption of the operating

CRediT authorship contribution statement

Yiqiong Wu: Methodology, Formal analysis, Software, Writing - original draft. Wei Huang: Supervision, Writing - review & editing, Funding acquisition. Yingjie Tian: Supervision, Writing - review & editing. Qing Zhu: Visualization, Software, Validation. Lean Yu: Methodology, Conceptualization, Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

The first author would like to acknowledge that this research was partially supported by NSFC grants (No. 71731009). The second author would like to acknowledge that this research was partially supported by NSFC grants (Nos. 72061127002, 2018WZDXM020). The fifth author would like to acknowledge that this research was partially supported by the Major Program of the National Social Science Foundation of China (No. 19ZDA103) and the Key Program of Research Center of Scientific Finance and

References (59)

  • N. Kozodoi et al. A multi-objective approach for profit-driven feature selection in credit scoring. Decis. Support Syst. (2019)

  • S. Lessmann et al. Benchmarking state-of-the-art classification algorithms for credit scoring: An update of research. Eur. J. Oper. Res. (2015)

  • J. Li et al. An evolution strategy-based multiple kernels multi-criteria programming approach: The case of credit decision making. Decis. Support Syst. (2011)

  • J. Lopez et al. Profit-based credit scoring based on robust optimization and feature selection. Inf. Sci. (2019)

  • G. Loterman et al. Benchmarking regression algorithms for loss given default modeling. Int. J. Forecast. (2012)

  • L. Ma et al. A new aspect on P2P online lending default prediction using meta-level phone usage data in China. Decis. Support Syst. (2018)

  • X. Ma et al. Study on a prediction of P2P network loan default based on the machine learning LightGBM and XGboost algorithms according to different high dimensional data cleaning. Electron. Commer. Res. Appl. (2018)

  • S. Maldonado et al. Integrated framework for profit-based feature selection and SVM classification in credit scoring. Decis. Support Syst. (2017)

  • S. Maldonado et al. Cost-based feature selection for Support Vector Machines: An application in credit scoring. Eur. J. Oper. Res. (2017)

  • M. Moscatelli et al. Corporate default forecasting with machine learning. Expert Syst. Appl. (2020)

  • V. Moscato et al. A benchmark of machine learning approaches for credit score prediction. Expert Syst. Appl. (2021)

  • S. Oreski et al. Genetic algorithm-based heuristic for feature selection in credit risk assessment. Expert Syst. Appl. (2014)

  • M. Oskarsdóttir et al. The value of big data for credit scoring: Enhancing financial inclusion using mobile phone data and social network analytics. Appl. Soft Computing (2019)

  • M. Papouskova et al. Two-stage consumer credit risk modelling using heterogeneous ensemble learning. Decis. Support Syst. (2019)

  • P. Pławiak et al. DGHNL: A new deep genetic hierarchical network of learners for prediction of credit scoring. Inf. Sci. (2020)

  • C. Rao et al. 2-stage modified random forest model for credit risk assessment of P2P network lending to Three Rurals borrowers. Appl. Soft Computing (2020)

  • F. Shen et al. Three-stage reject inference learning framework for credit scoring using unsupervised transfer learning and three-way decision theory. Decis. Support Syst. (2020)

  • F. Shen et al. A new deep learning ensemble credit risk evaluation model with an improved synthetic minority oversampling technique. Appl. Soft Computing (2021)

  • Y. Song et al. Multi-view ensemble learning based on distance-to-model and adaptive clustering for imbalanced credit risk assessment in P2P lending. Inf. Sci. (2020)