Statistics-based wrapper for feature selection: An implementation on financial distress identification with support vector machine
Introduction
Financial distress identification (FDI) is an effective tool for risk management, and it has received considerable attention from both academia and industry [1], [2], [3], [5], [6], [11], [16], [19], [27], [29], [33], [34], [35], [36], [37], [39], [40], [41], [42], [44], [48]. Identifying whether or not a company will fail helps financial institutions, managers, employees, investors and government officials control risk in their decisions. Predictive accuracy is the key index of whether such a tool is helpful in practice: a predictive model is generally assumed to be more useful the more accurate it is. To estimate accuracy, the whole dataset is commonly partitioned into a training dataset, a validation dataset and a testing dataset. The identification of new cases is simulated by using a model constructed on labelled data to predict unlabelled data. The partition is commonly repeated many times so that the statistical significance of the results can be analysed. Under this assumption, support vector machine (SVM) is an important technique for FDI for two reasons: (1) SVM is built on mature statistical learning theory [10], [46]; (2) previous evidence shows that SVM produces dominant predictive performance in FDI [12], [17], [18], [26], [30], [31], [38], [44], [47].
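The repeated partitioning described above can be sketched as follows. This is a minimal illustration only: the 60/20/20 proportions, the 30 repetitions and the function names are assumptions made for the example and are not taken from the paper.

```python
import random

def partition(n_samples, train_frac=0.6, valid_frac=0.2, seed=0):
    """Shuffle sample indices and split them into train/validation/test lists."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = round(n_samples * train_frac)
    n_valid = round(n_samples * valid_frac)
    train = indices[:n_train]
    valid = indices[n_train:n_train + n_valid]
    test = indices[n_train + n_valid:]  # whatever remains is held out for testing
    return train, valid, test

def repeated_partitions(n_samples, n_repeats=30, **kwargs):
    """Yield one partition per repetition, each with its own seed."""
    for repeat in range(n_repeats):
        yield partition(n_samples, seed=repeat, **kwargs)

# Hypothetical dataset of 100 companies, partitioned 30 times.
splits = list(repeated_partitions(100, n_repeats=30))
```

Each element of `splits` holds three disjoint index lists; a model would be fitted on the first, tuned on the second and scored on the third, and the scores collected over all repetitions would feed the significance analysis.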
Feature selection is a process that chooses information-rich features while retaining the meaning of the original features [14], [23]. Filters and wrappers are the two chief methods of feature selection. A filter uses an algorithm to search through the space of possible features and evaluates each candidate subset with a filter function. The filter function is not the model used for prediction or classification, so a filter does not take the preferences of that model into account. A wrapper is similar to a filter, but it evaluates each subset against the predictive model itself instead of a filter function.
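The difference between the two evaluation styles can be made concrete with a small sketch. Everything here is an illustrative assumption: the filter function is a simple standardized gap between class means, and the "model" is a nearest-class-mean rule used as a stand-in for SVM so that the example stays self-contained.

```python
from statistics import mean, pstdev

def filter_score(values, labels):
    """Model-independent score: gap between class means in units of overall spread."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    spread = pstdev(values) or 1.0  # guard against a constant feature
    return abs(mean(pos) - mean(neg)) / spread

def nearest_mean_predict(train_values, train_labels, test_values):
    """Stand-in model (NOT an SVM): assign each test value to the closer class mean."""
    pos_mean = mean(v for v, y in zip(train_values, train_labels) if y == 1)
    neg_mean = mean(v for v, y in zip(train_values, train_labels) if y == 0)
    return [1 if abs(v - pos_mean) < abs(v - neg_mean) else 0 for v in test_values]

def wrapper_score(train_values, train_labels, test_values, test_labels):
    """Model-dependent score: held-out accuracy of the stand-in model on this feature."""
    predicted = nearest_mean_predict(train_values, train_labels, test_values)
    hits = sum(p == y for p, y in zip(predicted, test_labels))
    return hits / len(test_labels)

# Toy single-feature data (hypothetical): 1 = distressed, 0 = healthy.
train_x, train_y = [0.1, 0.2, 0.3, 0.9, 1.0, 1.1], [0, 0, 0, 1, 1, 1]
test_x, test_y = [0.15, 0.25, 0.95, 1.05], [0, 0, 1, 1]

f_score = filter_score(train_x, train_y)
w_score = wrapper_score(train_x, train_y, test_x, test_y)
```

The filter score never consults a classifier, whereas the wrapper score changes whenever the model or its parameters change, which is why a wrapper can reflect the preferences of the model.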
A common drawback of previous research on SVM-based FDI is that it used either filters or genetic algorithms to select optimal feature subsets for SVM. Wrappers are expected to yield a feature subset that helps the model produce dominant predictive performance. Greedy hill climbing, which searches for the optimal feature subset by iteratively evaluating candidate subsets of features, is commonly used in wrappers. Genetic algorithms belong to this type. However, the drawback of using a genetic algorithm in a wrapper is that the output feature subset is not the same when the approach is run several times.
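For illustration, a greedy hill-climbing search of the kind mentioned above might look like the following forward-selection sketch. The evaluation function and the feature names are hypothetical placeholders; in a real wrapper the score would be the validated accuracy of the model on the candidate subset.

```python
def greedy_forward_selection(features, evaluate):
    """Add the single best feature at each step until no addition improves the score."""
    selected = []
    best_score = float("-inf")
    remaining = list(features)
    while remaining:
        # Score every one-feature extension of the current subset.
        scored = [(evaluate(selected + [f]), f) for f in remaining]
        candidate_score, candidate = max(scored, key=lambda pair: pair[0])
        if candidate_score <= best_score:
            break  # local optimum: no single addition helps
        selected.append(candidate)
        remaining.remove(candidate)
        best_score = candidate_score
    return selected, best_score

# Hypothetical per-feature contributions standing in for model accuracy.
GAIN = {"liquidity": 0.30, "leverage": 0.25, "profitability": 0.20, "noise": -0.05}

def toy_evaluate(subset):
    """Placeholder score: the sum of the assumed contributions of the chosen features."""
    return sum(GAIN[f] for f in subset)

selected, score = greedy_forward_selection(GAIN, toy_evaluate)
```

This loop has no random component, so repeated runs on the same data return the same subset; a genetic algorithm relies on random initialisation, crossover and mutation, which is where the run-to-run variation comes from.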
This research attempts to construct a novel, stable wrapper for SVM to identify financial distress. Two key issues in applying SVM are kernel selection and parameter optimization. The new wrapper is constructed on each of the following types of SVM: linear SVM (LSVM), polynomial SVM (PSVM), Gaussian SVM (GSVM) and sigmoid SVM (SSVM). After the kernel function is selected, many SVM models are produced by using various pairs of parameters. The predictions of these SVM models on each candidate feature are converted into ranking-order information for that feature. Two statistics computed from the ranking-order information, the mean and the standard deviation, are combined to calculate a feature selection index for SVM. This index is used to select optimal features.
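The ranking-order idea can be sketched in a few lines. Note the assumptions: the accuracy figures are invented, and the way the mean and standard deviation are combined (a plain sum, lower being better) is a placeholder, because the combining formula is not given in this excerpt.

```python
from statistics import mean, pstdev

def rank_features(accuracies):
    """Rank features for one parameter setting: rank 1 goes to the highest accuracy."""
    order = sorted(accuracies, key=lambda name: accuracies[name], reverse=True)
    return {name: position + 1 for position, name in enumerate(order)}

def selection_index(accuracy_by_setting):
    """Return {feature: (mean rank, S.D. of rank, index)} across parameter settings."""
    rankings = [rank_features(acc) for acc in accuracy_by_setting]
    summary = {}
    for name in accuracy_by_setting[0]:
        ranks = [ranking[name] for ranking in rankings]
        mean_rank, sd_rank = mean(ranks), pstdev(ranks)
        # ASSUMPTION: the paper's combining formula is not shown in this excerpt;
        # mean + S.D. is a placeholder in which lower values are better.
        summary[name] = (mean_rank, sd_rank, mean_rank + sd_rank)
    return summary

# Hypothetical accuracies of single-feature models under three parameter pairs.
accuracy_by_setting = [
    {"f1": 0.80, "f2": 0.70, "f3": 0.60},
    {"f1": 0.78, "f2": 0.65, "f3": 0.72},
    {"f1": 0.81, "f2": 0.69, "f3": 0.66},
]
summary = selection_index(accuracy_by_setting)
best_first = sorted(summary, key=lambda name: summary[name][2])
```

A feature that is ranked highly under every parameter pair gets a low mean rank and a low standard deviation, so it is preferred over a feature whose rank is good only for some parameter pairs. Because every step is deterministic, the same data always yield the same ordering.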
This paper is organized as follows. Section 2 briefly reviews the feature selection and parameter search methods used in previous research on SVM-based FDI. Section 3 presents the new wrapper for SVM-based FDI. Section 4 designs an experiment to test the efficiency and feasibility of the statistics-based wrapper. Section 5 discusses the experimental results. Section 6 concludes.
Section snippets
Feature selection and parameter search in previous research on SVM-based FDI
When SVM was first applied to identify financial distress, Shin et al. [38] employed a two-stage feature selection process consisting of a t-test followed by stepwise multivariate discriminant analysis (MDA). This type of feature selection belongs to the family of filters. The comparison of predictive performance between SVM and back-propagation neural networks (NN) indicated that SVM achieved higher accuracy than NN. Gaussian kernel was used and its parameters were
Kernels and parameters of SVM when constructing the approach
A wrapper for SVM evaluates feature subsets against the predictive performance of SVM itself. Kernel functions and parameters of SVM must therefore be set before the wrapper is constructed. There are four commonly used kernel functions for SVM: linear kernel (u′*v), polynomial kernel ((gamma*u′*v)^p), Gaussian kernel (exp(-gamma*|u-v|^2)) and sigmoid kernel (tanh(gamma*u′*v)) [7], [8]. Any one of these four kernel functions can be employed in the wrapper. The reason why we use the common range of {2^−10, 2^−9, …, 2^7, 2^8
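The four kernel expressions quoted above translate directly into code. The sketch below is illustrative only: the function names are invented, and the parameter grid assumes that the exponents run from −10 to 8 in steps of one, which is what the visible part of the range suggests.

```python
import math

def dot(u, v):
    """Inner product u'*v of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def linear_kernel(u, v):
    return dot(u, v)

def polynomial_kernel(u, v, gamma, p):
    return (gamma * dot(u, v)) ** p

def gaussian_kernel(u, v, gamma):
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)

def sigmoid_kernel(u, v, gamma):
    return math.tanh(gamma * dot(u, v))

# ASSUMPTION: candidate parameter values 2^-10, 2^-9, ..., 2^7, 2^8.
grid = [2.0 ** k for k in range(-10, 9)]

u, v = [1.0, 2.0], [3.0, 4.0]
values = {
    "linear": linear_kernel(u, v),
    "polynomial": polynomial_kernel(u, v, gamma=0.5, p=2),
    "gaussian": gaussian_kernel(u, v, gamma=grid[0]),
    "sigmoid": sigmoid_kernel(u, v, gamma=grid[0]),
}
```

Pairing values from `grid` (for example one for the penalty parameter and one for gamma) would give the "various pairs of parameters" from which the many SVM models are built.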
Objective, data and variables
The objective of this empirical research is to test the effectiveness and feasibility of SVM with the statistics-based wrapper on the task of FDI. Pioneering research on feature selection for SVM was conducted by Chen and Lin [9]. They applied feature selection for SVM to a dataset with 500 features. Filters were first used to reduce the set to only 16 features, which were then used in a wrapper. We used the following procedure: first using some filter rules to
Optimal features from the filters and wrappers
Gamma for PSVM was set to the default value of libSVM, since the model could not be trained on the data set when gamma was larger than 2^6. Mean and S.D. of ranking orders of all features are listed in Table 3, where ΣFS/2M = 60.08. The index values of feature selection in the statistics-based wrapper are illustrated in Table 3. Meanwhile, features respectively selected by a wrapper integrating forward and backward selection on Mahalanobis distance with SVM [39] with RBF kernel and default
Implication and conclusion
One implication of this research is that, in feature selection, it is effective to consider the preferences of a predictive model for FDI under various parameters. Mean and standard deviation information is derived from the ranking-order information of the model's performance with various parameters on each feature. By using the statistics-based approach, preferences on features are converted into a feature selection index, which is used to select preferred features. Non-parametric techniques have
Acknowledgements
This research is partially supported by the National Natural Science Foundation of China (No. 71171179; 71371171), the Zhejiang Provincial National Science Foundation for Distinguished Young Scholars of China (No. LR13G010001), the Zhejiang Provincial National Science Foundation of China (No. LY13G010001), and the Humanities and Social Science Foundation of Ministry of Education of China (no. 13YJC630140). The authors gratefully thank editors and anonymous referees for their comments and
References (48)
- et al., Bankruptcy prediction modeling with hybrid case-based reasoning and genetic algorithms approach, Applied Soft Computing (2009)
- Deciding the financial health of dot-coms using rough sets, Information & Management (2006)
- et al., Business failure prediction using rough sets, European Journal of Operational Research (1999)
- et al., Forecasting financial condition of Chinese listed companies based on support vector machine, Expert Systems with Applications (2008)
- et al., Tests of the generalizability of Altman's bankruptcy prediction model, Journal of Business Research (2001)
- Incorporating a non-additive decision making method into multi-layer neural networks and its application to financial distress analysis, Knowledge-Based Systems (2008)
- et al., Predicting corporate financial distress based on integration of support vector machine and logistic regression, Expert Systems with Applications (2007)
- et al., Bankruptcy prediction using case-based reasoning, neural network and discriminant analysis for bankruptcy prediction, Expert Systems with Applications (1997)
- et al., Bankruptcy prediction: application of the Taylor's expansion in logistic regression, International Review of Financial Analysis (2000)
- et al., Ranking-order case-based reasoning for financial distress prediction, Knowledge-Based Systems (2008)
- Predicting business failure using multiple case-based reasoning combined with support vector machine, Expert Systems with Applications
- Developing a business failure prediction model via RST, GRA and CBR, Expert Systems with Applications
- Early warning of bank failure: a logit regression approach, Journal of Banking and Finance
- Bankruptcy prediction using support vector machine with optimal choice of kernel function parameters, Expert Systems with Applications
- Hybrid genetic algorithms and support vector machines for bankruptcy prediction, Expert Systems with Applications
- A threshold-varying artificial neural network approach for classification and its application to bankruptcy prediction problem, Computers and Operations Research
- DEA as a tool for bankruptcy assessment: a comparative study with logistic regression technique, European Journal of Operational Research
- Soft computing system for bank performance prediction, Applied Soft Computing
- An application of support vector machines in bankruptcy prediction model, Expert Systems with Applications
- Dynamic financial distress prediction using instance selection for the disposal of concept drift, Expert Systems with Applications
- Financial distress early warning based on group decision making, Computers and Operations Research
- Using Bayesian networks for bankruptcy prediction: some methodological issues, European Journal of Operational Research
- Using neural network ensembles for bankruptcy prediction and credit scoring, Expert Systems with Applications
- A quadratic interval logit model for forecasting bankruptcy, Omega
1. Young Researcher of World Federation on Soft Computing.