Exploring multivariate data using directions of high density

Statistics and Computing

Abstract

The most common techniques for graphically presenting a multivariate dataset involve projection onto a one- or two-dimensional subspace. Interpretation of such plots is not always straightforward because projections are smoothing operations: structure can be obscured by projection but never enhanced. In this paper an alternative procedure for finding interesting features is proposed, based on locating the modes of an induced hyperspherical density function, and a simple algorithm for this purpose is developed. Emphasis is placed on identifying non-linear effects, such as clustering; to this end the data are first sphered to remove all location, scale and correlational structure. A simulated bivariate dataset and a dataset on the artistic qualities of painters are used as examples.
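The two steps named in the abstract, sphering and locating modes of a density on the hypersphere, might be sketched as follows. This is a minimal illustration and not the paper's algorithm, which is not given in the abstract. The exponential (von Mises-Fisher-style) kernel, the concentration parameter `kappa`, the mean-shift-style fixed-point update, and all function names are assumptions made for the example. The covariance matrix is assumed to be non-singular.

```python
import numpy as np

def sphere(X):
    """Centre X and transform it so the sample covariance is the identity."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    # Symmetric inverse square root of the covariance (assumes evals > 0).
    W = evecs @ np.diag(evals ** -0.5) @ evecs.T
    return Xc @ W

def directions(Z):
    """Project each sphered point onto the unit hypersphere."""
    return Z / np.linalg.norm(Z, axis=1, keepdims=True)

def directional_density(u, U, kappa=10.0):
    """Unnormalised kernel density estimate at unit vector u.

    The exponential kernel and the value of kappa are assumptions."""
    return np.mean(np.exp(kappa * (U @ u - 1.0)))

def find_mode(u0, U, kappa=10.0, n_iter=200, tol=1e-10):
    """Mean-shift-style fixed-point iteration on the sphere (assumed scheme)."""
    u = u0 / np.linalg.norm(u0)
    for _ in range(n_iter):
        w = np.exp(kappa * (U @ u - 1.0))  # kernel weight of each direction
        v = w @ U                          # weighted sum of the directions
        v /= np.linalg.norm(v)             # project back onto the sphere
        if np.linalg.norm(v - u) < tol:
            return v
        u = v
    return u
```

Starting `find_mode` from several of the observed directions (rows of `U`) and collecting the distinct limits would give candidate directions of high density. The choice of `kappa` plays the role of a smoothing parameter and would need tuning for real data.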




Cite this article

Foster, P. Exploring multivariate data using directions of high density. Statistics and Computing 8, 347–355 (1998). https://doi.org/10.1023/A:1008828723097
