Abstract
A credit scoring classification problem can be defined as a decision process in which information from application forms for new or extended credit is used to separate applicants into good and bad credit risks. In the credit industry, it is important to find a method that optimally separates applicants into ‘goods’ and ‘bads’, as good classification models can provide a competitive advantage. These classification models can be developed with statistical techniques (e.g. statistical discriminant analysis and logistic regression), neural networks and mathematical programming (MP) discriminant analysis methods. MP methods are less widely used in practice in spite of their advantages: they are non-parametric, and desired classifier characteristics can be represented by constraints in the MP model. In this paper, an MP model is described and compared with other known methods, using real data. The MP model uses as its objective function the minimization of the sum of the deviations of misclassified observations from the discriminant function. The performance of this MP model is evaluated on three datasets for credit card applications and is compared with the performance of a k-NN classifier, discriminant analysis, support vector machines and logistic regression.
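The objective described above, minimizing the sum of deviations (MSD) of misclassified observations from the discriminant function, can be illustrated with a small linear program. The following is only a sketch under stated assumptions, not the formulation used in the paper: it assumes a linear discriminant function w·x + b, a fixed gap of ±1 between the two groups as the normalization that rules out the trivial solution w = 0, and SciPy's `linprog` as the solver. The function name `fit_msd` and the toy data are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def fit_msd(X_good, X_bad):
    """Fit a linear discriminant by minimizing the sum of deviations (MSD).

    Assumed formulation (one common variant, not necessarily the paper's):
        minimize   sum_i d_i
        subject to w.x_i + b + d_i >=  1   for every 'good' i
                   w.x_i + b - d_i <= -1   for every 'bad' i
                   d_i >= 0
    """
    n_g, p = X_good.shape
    n_b = X_bad.shape[0]
    n = n_g + n_b

    # Decision vector layout: [w (p entries), b (1 entry), d (n entries)].
    cost = np.concatenate([np.zeros(p + 1), np.ones(n)])

    # Goods, rewritten in <= form: -w.x_i - b - d_i <= -1
    A_good = np.hstack(
        [-X_good, -np.ones((n_g, 1)), -np.eye(n_g), np.zeros((n_g, n_b))]
    )
    # Bads: w.x_i + b - d_i <= -1
    A_bad = np.hstack(
        [X_bad, np.ones((n_b, 1)), np.zeros((n_b, n_g)), -np.eye(n_b)]
    )
    A_ub = np.vstack([A_good, A_bad])
    b_ub = -np.ones(n)

    # w and b are free; deviations are non-negative.
    bounds = [(None, None)] * (p + 1) + [(0, None)] * n

    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:p], res.x[p], res.fun


# Toy, linearly separable data (illustrative values only, not from the paper).
X_good = np.array([[2.0, 2.0], [3.0, 3.0], [3.0, 2.0]])
X_bad = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])

w, b, total_deviation = fit_msd(X_good, X_bad)
good_scores = X_good @ w + b
bad_scores = X_bad @ w + b
print("weights:", w, "intercept:", b, "sum of deviations:", total_deviation)
```

An applicant would then be classed as ‘good’ when the score w·x + b is positive and ‘bad’ otherwise. Other normalizations and cutoff choices exist in the MP discriminant analysis literature, and they can lead to different classifiers on the same data.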
Cite this article
Falangis, K. The use of MSD model in credit scoring. Oper Res Int J 7, 481–503 (2007). https://doi.org/10.1007/BF03024859
DOI: https://doi.org/10.1007/BF03024859