Abstract
This paper is concerned with the selection of inputs for classification models based on ratios of measured quantities. For this purpose, all possible ratios are built from the quantities involved and variable selection techniques are used to choose a convenient subset of ratios. In this context, two selection techniques are proposed: one based on a pre-selection procedure and another based on a genetic algorithm. In an example involving the financial distress prediction of companies, the models obtained from ratios selected by the proposed techniques compare favorably to a model using ratios usually found in the financial distress literature.
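The ratio-building step described above can be sketched in a few lines. This is only a minimal illustration, not the authors' procedure: the abstract does not specify the pre-selection procedure or the genetic algorithm, so a simple Fisher-type separability score is swapped in as a stand-in ranking criterion. The quantity names (`assets`, `debt`, `sales`), the toy data, and the helpers `build_ratios`, `fisher_score`, and `select_ratios` are all hypothetical, and every measured quantity is assumed to be nonzero so that each ratio is defined.

```python
from itertools import permutations
from statistics import mean, pvariance


def build_ratios(samples, names):
    """Build every ratio x_i / x_j (i != j) from the measured quantities.

    Assumes all quantities are nonzero (illustrative simplification).
    """
    ratios = {}
    for i, j in permutations(range(len(names)), 2):
        ratios[f"{names[i]}/{names[j]}"] = [row[i] / row[j] for row in samples]
    return ratios


def fisher_score(values, labels):
    """Stand-in criterion (an assumption, not the paper's method):
    squared gap between class means over the summed class variances."""
    class0 = [v for v, y in zip(values, labels) if y == 0]
    class1 = [v for v, y in zip(values, labels) if y == 1]
    gap = (mean(class0) - mean(class1)) ** 2
    spread = pvariance(class0) + pvariance(class1)
    return gap / (spread + 1e-12)  # small constant avoids division by zero


def select_ratios(ratios, labels, k):
    """Keep the k ratios with the highest stand-in score."""
    ranked = sorted(ratios, key=lambda name: fisher_score(ratios[name], labels),
                    reverse=True)
    return ranked[:k]


# Hypothetical toy data: rows are companies, columns follow `names`.
names = ["assets", "debt", "sales"]
samples = [
    (100.0, 20.0, 50.0),
    (120.0, 30.0, 55.0),
    (90.0, 80.0, 45.0),
    (110.0, 95.0, 60.0),
]
labels = [0, 0, 1, 1]  # 0 = healthy, 1 = distressed (made-up labels)

ratios = build_ratios(samples, names)
selected = select_ratios(ratios, labels, k=2)
print(len(ratios), "candidate ratios; selected:", selected)
```

With n measured quantities this construction yields n(n - 1) ordered ratios, which is why a selection step is needed before the classifier is fitted.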
Cite this article
Galvão, R.K., Becerra, V.M. & Abou-Seada, M. Ratio Selection for Classification Models. Data Mining and Knowledge Discovery 8, 151–170 (2004). https://doi.org/10.1023/B:DAMI.0000015913.38787.b3
DOI: https://doi.org/10.1023/B:DAMI.0000015913.38787.b3