Abstract
We propose a variable selection method based on global score estimation. It can serve as a selection criterion in any multivariate method without external variables, such as principal component analysis, factor analysis and correspondence analysis. The method selects the subset of variables that approximates the original global scores (e.g. principal component scores, factor scores and individual scores) as closely as possible in the least-squares sense, where the approximating scores are computed from the selected variables only. Because global scores are usually mutually orthogonal, the estimated scores are restricted to be mutually orthogonal as well. We propose three computational steps for estimating the scores, which differ in how this restriction is satisfied. An example data set is analyzed to demonstrate the performance and usefulness of the method: the proposed algorithm is evaluated, and the results obtained with four cost-saving selection procedures are compared. The example shows that combining these steps and procedures yields more accurate results quickly.
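The selection criterion described in the abstract can be illustrated with a minimal sketch for the principal component case. It is not the authors' algorithm: the abstract does not specify the three computational steps or the four cost-saving procedures, so the sketch substitutes a QR orthogonalization for the orthogonality restriction and a plain exhaustive search for the selection procedure. The function names (pc_scores, estimate_scores, best_subset) are hypothetical, and the sketch assumes that the subset size q is at least the number of scores r.

```python
import itertools
import numpy as np

def pc_scores(X, r):
    """First r principal component scores of the column-centered data."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :r] * s[:r]

def estimate_scores(X, subset, Z):
    """Estimate the scores Z from the selected variables only.

    Assumption (not from the paper): orthogonality is imposed by a QR
    decomposition of the least-squares fit, and each orthogonal column
    is then rescaled to be as close as possible to the matching score.
    """
    Xs = X[:, list(subset)]
    Xs = Xs - Xs.mean(axis=0)
    B, *_ = np.linalg.lstsq(Xs, Z, rcond=None)  # unrestricted LS fit
    Q, _ = np.linalg.qr(Xs @ B)                 # orthonormal columns
    coef = np.sum(Q * Z, axis=0)                # best LS scale per column
    return Q * coef                             # columns stay orthogonal

def best_subset(X, q, r):
    """Exhaustive search for the q variables with the smallest
    residual sum of squares between original and estimated scores."""
    Z = pc_scores(X, r)
    best = None
    for subset in itertools.combinations(range(X.shape[1]), q):
        loss = float(np.sum((Z - estimate_scores(X, subset, Z)) ** 2))
        if best is None or loss < best[1]:
            best = (subset, loss)
    return best
```

Calling best_subset(X, q, r) on an n-by-p data matrix returns the selected column indices together with their loss. Exhaustive search grows combinatorially with p, which is presumably why the paper compares cost-saving selection procedures instead.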
About this article
Cite this article
Fueda, K., Iizuka, M. & Mori, Y. Variable selection in multivariate methods using global score estimation. Comput Stat 24, 127–144 (2009). https://doi.org/10.1007/s00180-008-0109-9
DOI: https://doi.org/10.1007/s00180-008-0109-9