Combining market and accounting-based models for credit scoring using a classification scheme based on support vector machines
Introduction
Credit risk refers to the probability that a client will not be able to meet his/her debt obligations (default). Over the years, many factors have contributed to the increasing importance of accurate credit risk measurement. Altman and Saunders [1] list five main issues, which are still valid in the current context: (i) a worldwide structural increase in the number of defaults, (ii) a trend towards disintermediation by the highest quality and largest borrowers, (iii) more competitive margins on loans, (iv) a declining value of real assets (and thus collateral) in many markets, and (v) a dramatic growth of high risk exposures including credit derivatives. Credit risk measurement is nowadays a critical issue as demonstrated by the recent outbreak of the global credit crisis in 2007–2008.
In credit risk management, accurate estimation of the probability of default is crucial. Credit rating models (CRMs) are widely used for that purpose. CRMs evaluate the creditworthiness of obligors, estimate the probabilities of default, and classify obligors into risk groups. In a corporate credit granting context, most CRMs combine key financial (accounting) and non-financial data into an aggregate index indicating the credit risk of the firms. Such models can be constructed with a variety of statistical, data mining, and operations research techniques (e.g., logistic regression, neural networks, support vector machines, rule induction algorithms, and multicriteria decision making). Comprehensive reviews of this line of research can be found in [2], [3], [4]. Despite their success and popularity, traditional credit scoring models are mostly static and are based on historical accounting data, which may fail to represent adequately the future of the firms and the trends in the business environment [1], [5]. This is particularly important in periods of economic turmoil, when exogenous conditions deteriorate rapidly, thus affecting corporate activity and leading to increased credit risk levels throughout the market. Mensah [6] and Hillegeist et al. [7] also discuss issues related to accounting standards and practices, which affect the quality of the information that financial statements provide.
The shortcomings of accounting-based credit scoring models have led to the consideration of a wide variety of alternative approaches (comprehensive overviews can be found in [1], [8]). Among them, structural models have attracted considerable interest. Structural models use stock exchange data to assess the probability of default [9], [10]. Stock prices reflect all the information related to the current status of the firms as well as the investors’ expectations about their future prospects [5]. Furthermore, market data are constantly updated in accordance with new information that becomes available about the operation of firms and the environment in which they operate. These features of market data and models indicate that they may be better suited for default prediction and credit risk measurement. Indeed, several studies provide empirical results in support of market models in the context of credit risk modeling and bankruptcy prediction [5], [7]. Market models have also been shown to contribute to the construction of improved hybrid systems in combination with accounting-based models [11], [12].
Despite their strong theoretical grounds and good predictive power, market models are limited to firms listed in stock exchanges. Therefore, their extension to non-listed firms has attracted some interest over the past decade. Moody’s KMV RiskCalc™ model [13] is a commercial implementation, which has been employed in several countries with positive results [14], [15]. Altman et al. [16] used US data to examine the potential of developing multivariate regression models providing estimates for the probability of default implied by a market model. The authors found that this approach provides similar results to default prediction models, thus concluding that both approaches should be treated as complementary sources of information.
This study extends the results of Altman et al. [16] by investigating the applicability of a market-based credit risk modeling approach in a context where the hypotheses of market efficiency may be invalid [17]. In particular, we test whether a definition of default on the basis of a market model can be employed to build a credit scoring model for non-listed firms and compare the results to a default prediction model fitted on historical default data. The analysis is based on data from Greece over the period 2005–2010 using samples of listed and non-listed firms. The Greek case provides a challenging context for two main reasons. First, the Greek stock market, after flourishing at the end of the 1990s, entered a period characterized by increasing volatility, decreasing liquidity, and high market concentration, with a few large-capitalization companies dominating the market. These features became even clearer during the international credit crisis and the subsequent sovereign debt crisis that hit the country, thus putting into serious question the efficiency of the Greek stock market [18]. Second, the crisis had a particularly strong effect on the Greek economy, with a sharp deterioration of the general economic and business conditions, which led to an unprecedented increase in the number of defaults and bankruptcies over a very short period of time. Thus, credit risk management becomes a challenging issue in this context, and the peculiarities of the Greek case cast doubt on whether an approach grounded in a market model could actually provide useful results.
On the methodological side, non-parametric machine learning techniques are employed based on the framework of support vector machines (SVMs). The analysis is performed in two stages. First, a market model is used to assess the probability of default for listed companies and classify them into risk groups under different risk-taking scenarios. Risk assessment and classification models are then developed using linear and nonlinear support vector machines, as well as a recently developed additive SVM model that is well suited to the requirements of credit rating systems. Logistic regression is also employed for comparative purposes and feature selection. The developed models are applied to a sample of non-listed firms. The comparison against traditional credit scoring models fitted on historical default data shows that the market-based modeling approach provides very competitive results. Among the machine learning techniques used in the analysis, the additive SVM model provides the best results.
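The two-stage procedure described above can be illustrated with a minimal sketch. It assumes that a risk-taking scenario corresponds to a cutoff on the market-model probability of default (PD), and it uses a plain linear soft-margin SVM trained by batch subgradient descent on the regularised hinge loss as a stand-in for the SVM formulations of the study, which are not shown in this excerpt. The function names, the cutoff values, and the hyperparameters are hypothetical.

```python
def label_by_pd(pds, cutoff):
    """Stage 1 (assumed form): +1 = high risk if the market-model PD
    exceeds the scenario cutoff, otherwise -1 = low risk."""
    return [1 if p > cutoff else -1 for p in pds]


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=500):
    """Stage 2 (stand-in): linear soft-margin SVM fitted by batch
    subgradient descent on lam/2 * ||w||^2 + mean hinge loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regulariser
        grad_b = 0.0
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score < 1:  # margin violated: hinge term is active
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, X):
    """Assign each firm to the high-risk (+1) or low-risk (-1) group."""
    return [1 if sum(wj * xj for wj, xj in zip(w, xi)) + b >= 0 else -1
            for xi in X]
```

In this sketch the listed firms would supply both the PDs (for labelling) and the financial ratios in `X` (for training), while `predict` would be applied to the ratios of non-listed firms. Changing the cutoff passed to `label_by_pd` changes the class labels and hence the fitted model, which is one way to read "different risk-taking scenarios".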
The rest of the article is organized as follows. Section 2 presents the market model used in the analysis as well as the SVM formulations used for constructing the credit risk assessment models. Section 3 is devoted to the empirical analysis, including the presentation of the data and the obtained results. Finally, Section 4 concludes the paper, summarizes the main findings of this research, and proposes some future research directions.
Section snippets
The market model
Market-based models for credit risk assessment are founded on the works of Black, Scholes and Merton (henceforth referred to as BSM) [9], [10]. In the BSM framework, a firm is assumed to have a simple debt structure, consisting of a single liability L that is due in time T. From the financial point of view, a firm is assumed to default on its debt, if the market value of its assets (A) at time T is lower than L (i.e., if the firm’s assets are not enough to cover its debt). In this context,
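The default condition stated above (asset value A at time T below the liability L) can be turned into a probability under the usual textbook assumption that A follows a geometric Brownian motion. The sketch below uses that standard Merton-type expression; the drift `mu`, the asset volatility `sigma`, and the function name are assumptions, since the excerpt ends before the exact specification used in the study.

```python
from math import log, sqrt
from statistics import NormalDist


def merton_pd(assets, liability, mu, sigma, horizon=1.0):
    """P(A_T < L) when assets follow geometric Brownian motion with
    drift mu and volatility sigma (textbook form, assumed here)."""
    if min(assets, liability, sigma, horizon) <= 0:
        raise ValueError("assets, liability, sigma and horizon must be positive")
    # Distance to default: standardised gap between expected log-assets and log-debt
    distance = (log(assets / liability) + (mu - 0.5 * sigma ** 2) * horizon) / (
        sigma * sqrt(horizon)
    )
    return NormalDist().cdf(-distance)
```

For a fixed liability, a larger asset value or a lower volatility increases the distance to default and therefore lowers the estimated PD. In applications the asset value and its volatility are not observed directly and have to be inferred from equity data, a step this sketch does not cover.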
Data and variables
Two data samples are used in the analysis. The first includes 1314 firm-year observations involving (non-financial) firms listed in the Athens Stock Exchange (ASE) over the period 2005–2010. For each year t in that period, the sample includes all firms traded throughout year t in ASE and their daily logarithmic returns over the whole year were used to estimate their PDs at the end of year t. The second sample consists of 10,716 firm-year observations for non-listed Greek firms from the
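As an illustration of how daily logarithmic returns over a year can feed a PD estimate, the sketch below computes the returns from a price series and an annualised volatility from them. Using the return volatility as the market input, the 250-trading-day convention, and the function names are all assumptions; the estimation procedure of the study is not shown in this excerpt.

```python
from math import log, sqrt
from statistics import stdev


def daily_log_returns(prices):
    """Logarithmic return between each pair of consecutive closing prices."""
    return [log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


def annualised_volatility(prices, trading_days=250):
    """Sample standard deviation of daily log returns, scaled to one year.
    The 250-day scaling is a common convention, assumed here."""
    returns = daily_log_returns(prices)
    return stdev(returns) * sqrt(trading_days)
```

At least three prices are needed, because the sample standard deviation requires two or more returns.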
Conclusion and future perspectives
This study examined the development and implementation of a framework for building corporate credit scoring models based solely on publicly available data. To this end, the BSM model was used to introduce a proxy definition of default, based on market data instead of the traditional approach based on the credit history of the firms. The market model’s estimates of default were linked to models combining publicly available financial data. These models can be easily employed to evaluate any firm
References (35)
- et al. Credit risk measurement: developments over the last 20 years. J. Banking Finance (1997)
- A survey of credit and behavioral scoring: forecasting financial risk of lending to consumers. Int. J. Forecasting (2000)
- et al. Comparing the performance of market-based and accounting-based bankruptcy prediction models. J. Banking Finance (2008)
- et al. A hybrid bankruptcy prediction model with dynamic loadings on accounting-ratio-based and market-based information: a binary quantile regression approach. J. Empir. Finance (2010)
- et al. A hybrid KMV model, random forests and rough set theory approach for credit rating. Knowledge Based Syst. (2012)
- Inefficient markets and credit risk modeling: Why Merton’s model failed. J. Policy Model. (2006)
- et al. Greek market efficiency and its international integration. J. Int. Finance Markets Inst. Money (2011)
- et al. Comprehensible credit scoring models using rule extraction from support vector machines. Eur. J. Oper. Res. (2007)
- et al. Support vector machines for credit scoring and discovery of significant features. Expert Syst. Appl. (2009)
- Using Gaussian process based kernel classifiers for credit rating forecasting. Expert Syst. Appl. (2011)
- Prototype risk rating system. J. Banking Finance
- Introduction to ROC analysis. Pattern Recogn. Lett.
- Economic benefit of powerful credit scoring. J. Banking Finance
- Credit rating systems: regulatory framework and comparative evaluation of existing methods
- Credit scoring, statistical techniques and evaluation criteria: a review of the literature. Intell. Syst. Acc. Finance Manage.
- An examination of the stationarity of multivariate bankruptcy prediction models: a methodological study. J. Acc. Res.
- Assessing the probability of bankruptcy. Rev. Acc. Stud.
Cited by (25)
Machine learning models for credit analysis improvements: Predicting low-income families’ default
2019, Applied Soft Computing Journal
Citation Excerpt: For example, Tsai [31], Chang et al. [32], Feng et al. [6], Jadhav et al. [7], Tian et al. [33], Yu et al. [34], Óskarsdóttir et al. [35] among others, analysed datasets on a variety of topics. More recently, several studies have demonstrated the adoption of machine-learning techniques in credit modelling, highlighting various methodologies to estimate the probability of default, such as SVM [36,37], Decision Tree [38], Random Forest [39], and Bagging and Boosting [40]. Most studies highlight the advantages of using machine-learning systems in credit-risk analysis due to a better classification performance than that of traditional techniques, such as Logistic Regression [5,31,32].
Supply chain finance: From traditional to supply chain credit rating
2019, Journal of Purchasing and Supply Management
A new decision-making approach for multiple criteria sorting with an imbalanced set of assignment examples
2018, European Journal of Operational Research
Citation Excerpt: One noteworthy difficulty is the imbalanced distribution of alternatives among considered categories, which exists in a wide range of real-world applications. For example, in credit risk assessment, firms are classified into two classes by a bank loan officer: default and non-default, and the number of default firms is significantly less than that of non-default firms (Angilella & Mazzù, 2015; Marinakis, Marinaki, Doumpos, Matsatsinis, & Zopounidis, 2008; Niklis, Doumpos, & Zopounidis, 2014); in ABC inventory classification, inventory items are assigned to three classes according to specific criteria, items of high value but small in number are termed as class A, items of low value but large in number are termed as class C, and items that fall between these two classes are termed as class B (Liu et al., 2016); in engineering management, activities carried out by a project team are assigned into classes of managerial practices, which include different control mechanisms for a project manager, and the class of activities which require most attention are usually small in number while the class of non-critical activities are large in quantity (de Miranda Mota & de Almeida, 2012). It is challenging to develop a sorting model from an imbalanced set of assignment examples.
Selection of Support Vector Machines based classifiers for credit risk domain
2015, Expert Systems with Applications
Citation Excerpt: The amount of it is not large, which may be influenced by the limitations of availability of the necessary financial/bankruptcy data (although the number of open financial datasources seems to be rising). (Harris, 2015) used a dataset of over 20,000 entries from Barbados credit unions for model development to develop SVM linear and nonlinear classifier together with clustered SVM; the results indicated that performance of linear SVM did not significantly differ from SVM using RBF kernel; similar conclusion can be drawn from the results in (Niklis, Doumpos, & Zopounidis, 2014). Other recent research (Zhang, Gao, & Shi, 2014) used a USA credit dataset of over 6000 instances and also reported results which indicate that application of nonlinear SVM kernel for generic SVM, fuzzy SVM and hybrid fuzzy SVM does not show significant increase in classification accuracy, compared to linear SVM (resulting in accuracy of ∼75%).
Enterprise credit risk portrait and evaluation from the perspective of the supply chain
2024, International Transactions in Operational Research