Abstract
An approach to non-linear principal components using radially symmetric kernel basis functions is described. The procedure consists of two steps. The first is a projection of the data set to a reduced dimension using a non-linear transformation whose parameters are determined by the solution of a generalized symmetric eigenvector equation. This is achieved by demanding a maximum variance transformation subject to a normalization condition (Hotelling's approach) and can be related to the homogeneity analysis approach of Gifi through the minimization of a loss function. The transformed variables are the principal components, whose values define contours, or more generally hypersurfaces, in the data space. The second step defines the fitting surface, the principal surface, in the data space (again as a weighted sum of kernel basis functions) using the definition of self-consistency of Hastie and Stuetzle. The parameters of this principal surface are determined by a singular value decomposition, and cross-validation is used to obtain the kernel bandwidths. The approach is assessed on four data sets.
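To make the first step concrete, the following is a minimal sketch of a maximum variance projection obtained from a generalized symmetric eigenvector equation. It is an illustration under stated assumptions, not the paper's implementation: the Gaussian kernel, the single shared bandwidth `h`, the centres drawn as a random subset of the data, and the normalization matrix `B` (here the kernel matrix of the centres plus a small ridge) are all choices made for this example and are not specified in the abstract. The function names are hypothetical.

```python
import numpy as np

def gaussian_design(X, centres, h):
    """n x m matrix of Gaussian kernel values exp(-||x - c||^2 / (2 h^2))."""
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * h ** 2))

def kernel_components(X, centres, h, n_components, ridge=1e-6):
    """Maximise w' S w subject to w' B w = 1, i.e. solve S w = lam B w.

    S is the sample covariance of the kernel basis outputs.
    B is an ASSUMED normalization matrix (centre kernel matrix + ridge).
    """
    Phi = gaussian_design(X, centres, h)
    Phic = Phi - Phi.mean(axis=0)              # centre the basis outputs
    S = Phic.T @ Phic / len(X)
    B = gaussian_design(centres, centres, h) + ridge * np.eye(len(centres))

    # Reduce the generalized problem to a standard symmetric one using
    # the Cholesky factor B = L L': C = L^{-1} S L^{-T}.
    L = np.linalg.cholesky(B)
    C = np.linalg.solve(L, np.linalg.solve(L, S).T).T
    C = 0.5 * (C + C.T)                        # remove round-off asymmetry
    lam, V = np.linalg.eigh(C)                 # eigenvalues in ascending order

    idx = np.argsort(lam)[::-1][:n_components] # keep the largest eigenvalues
    W = np.linalg.solve(L.T, V[:, idx])        # back-transform: w = L^{-T} v
    return W, lam[idx], Phic @ W

# Synthetic data: noisy points on a half circle (illustrative only).
rng = np.random.default_rng(0)
t = rng.uniform(0.0, np.pi, 200)
X = np.column_stack([np.cos(t), np.sin(t)]) + 0.05 * rng.normal(size=(200, 2))
centres = X[rng.choice(len(X), size=10, replace=False)]

W, lam, scores = kernel_components(X, centres, h=0.5, n_components=2)
```

Each column of `scores` is one non-linear component evaluated at the data points, and by construction its sample variance equals the corresponding eigenvalue in `lam`. The second step (fitting the principal surface and choosing bandwidths by cross-validation) is not sketched here.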
References
Becker, R. A. and Chambers, J. M. (1984) S: An Interactive Environment for Data Analysis and Graphics. Wadsworth Statistics/Probability Series, Belmont, CA.
Bekker, P. and de Leeuw, J. (1988) Relations between variants of non-linear principal components analysis. In J. L. A. van Rijckevorsel and J. de Leeuw, eds, Component and Correspondence Analysis. Dimension Reduction by Function Approximation, pp. 1–31. Wiley, New York.
Bennett, G. W. (1988) Determination of anaerobic threshold. Canadian Journal of Statistics, 16(3), 307–16.
Broomhead, D. S. and Lowe, D. (1988) Multi-variable functional interpolation and adaptive networks. Complex Systems, 2(3), 269–303.
de Leeuw, J. (1982) Nonlinear principal components analysis. In H. Caussinus, ed., COMPSTAT '82. Proceedings in Computational Statistics. Physica-Verlag, Vienna.
Flury, B. D. (1993) Estimation of principal points. Applied Statistics, 42(1), 139–51.
Gifi, A. (1990) Nonlinear Multivariate Analysis. Wiley, New York.
Hand, D. J. (1981) Discrimination and Classification. Wiley, New York.
Hand, D. J. (1982) Kernel Discriminant Analysis. Volume 2 of Pattern Recognition and Image Processing Research Studies Series. Research Studies Press, Letchworth, Herts.
Hand, D. J., Daly, F., Lunn, A. D., McConway, K. J. and Ostrowski, E. (1994) A Handbook of Small Data Sets. Chapman & Hall, London.
Hastie, T. and Stuetzle, W. (1989) Principal curves. Journal of the American Statistical Association, 84(406), 502–16.
Hotelling, H. (1933) Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology, 24, 417–41, 498–520.
Kramer, M. A. (1991) Nonlinear principal component analysis using autoassociative neural networks. American Institute of Chemical Engineers Journal, 37(2), 233–43.
LeBlanc, M. and Tibshirani, R. (1994) Adaptive principal surfaces. Journal of the American Statistical Association, 89(425), 53–64.
Lowe, D. (1995) On the use of nonlocal and non positive definite basis functions in radial basis function networks. Fourth IEE International Conference on Artificial Neural Networks, Cambridge, pp. 206–211. IEE Conference Publication 409.
Martin, J.-F. (1988) On probability coding. In J. L. A. van Rijckevorsel and J. de Leeuw, eds, Component and Correspondence Analysis. Dimension Reduction by Function Approximation, pp. 103–14. Wiley, New York.
Nakagawa, S., Ono, Y. and Hirata, Y. (1991) Dimensionality reduction of dynamical patterns using a neural network. In B. H. Juang, S. Y. Kung, and C. A. Kamm, eds, Neural Networks for Signal Processing, Proceedings of the 1991 IEEE Workshop, pp. 256–65, Princeton, NJ.
Pearson, K. (1901) On lines and planes of closest fit. Philosophical Magazine, 6, 559–72.
Powell, M. J. D. (1990) The theory of radial basis function approximation in 1990. DAMTP Numerical Analysis Report 1990/NA11, University of Cambridge, Department of Applied Mathematics and Theoretical Physics, Silver Street, Cambridge CB3 9EW, UK.
Silverman, B. W. (1986) Density Estimation for Statistics and Data Analysis. Chapman & Hall, London.
Stewart, G. W. (1973) Introduction to Matrix Computations. Academic Press, London.
Tibshirani, R. (1992) Principal curves revisited. Statistics and Computing, 2(4), 183–90.
Usui, S., Nakauchi, S. and Nakano, M. (1990) Reconstruction of Munsell color space by a five-layered neural network. In IJCNN Int. Joint Conf. Neural Networks, San Diego, Vol. II, pp. 515–20.
van Rijckevorsel, J. L. A. (1988) Fuzzy coding and B-splines. In J. L. A. van Rijckevorsel and J. de Leeuw, eds, Component and Correspondence Analysis. Dimension Reduction by Function Approximation, pp. 33–54. Wiley, New York.
Wyse, N., Dubes, R. and Jain, A. K. (1980) A critical evaluation of intrinsic dimensionality algorithms. In E. S. Gelsema and L. N. Kanal, eds, Pattern Recognition in Practice, pp. 415–25. North-Holland, Amsterdam.
Cite this article
Webb, A.R. An approach to non-linear principal components analysis using radially symmetric kernel functions. Stat Comput 6, 159–168 (1996). https://doi.org/10.1007/BF00162527
DOI: https://doi.org/10.1007/BF00162527