Abstract
Many data-analytic questions can be formulated as (noisy) optimization problems. They explicitly or implicitly involve finding simultaneous combinations of values for a set of (“input”) variables that imply unusually large (or small) values of another designated (“output”) variable. Specifically, one seeks a set of subregions of the input variable space within which the value of the output variable is considerably larger (or smaller) than its average value over the entire input domain. In addition, it is usually desired that these regions be describable in an interpretable form involving simple statements (“rules”) concerning the input values. This paper presents a procedure directed towards this goal, based on the notion of “patient” rule induction. This patient strategy is contrasted with the greedy ones used by most rule induction methods, and with the semi-greedy ones used by some partitioning tree techniques such as CART. Applications involving scientific and commercial databases are presented.
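To make the “patient” strategy concrete, the following Python fragment is a minimal sketch of top-down peeling: at each step it discards only a small fraction alpha of the observations remaining in the current box, choosing the peel (off either end of one input variable) that leaves the largest mean of the output inside. This is illustrative only, not the authors' implementation; the function name patient_peel and the parameters alpha (peeling fraction) and min_support (minimum box support) are assumptions introduced here for the sketch.

```python
import numpy as np

def patient_peel(X, y, alpha=0.05, min_support=0.1):
    """Sketch of 'patient' top-down peeling (not the paper's code).

    Repeatedly remove a fraction alpha of the points remaining in the
    box, picking the peel off either end of one input variable that
    leaves the largest mean of y inside. Stop once the box holds less
    than a min_support fraction of the observations.
    """
    n, p = X.shape
    box = [(-np.inf, np.inf)] * p      # per-variable (lo, hi) bounds
    inside = np.ones(n, dtype=bool)    # points currently in the box

    while inside.mean() > min_support:
        best_mean, best_peel = -np.inf, None
        for j in range(p):
            xj = X[inside, j]
            lo_cut = np.quantile(xj, alpha)        # candidate lower peel
            hi_cut = np.quantile(xj, 1.0 - alpha)  # candidate upper peel
            candidates = [(X[:, j] >= lo_cut, ('lo', j, lo_cut)),
                          (X[:, j] <= hi_cut, ('hi', j, hi_cut))]
            for keep, bound in candidates:
                trial = inside & keep
                # Skip peels that remove nothing or everything.
                if trial.sum() in (0, inside.sum()):
                    continue
                m = y[trial].mean()
                if m > best_mean:
                    best_mean, best_peel = m, (trial, bound)
        if best_peel is None:          # no admissible peel remains
            break
        inside, (side, j, cut) = best_peel
        lo, hi = box[j]
        box[j] = (cut, hi) if side == 'lo' else (lo, cut)
    return box, inside
```

Each finite bound in the returned box reads as a simple rule such as x_j ≥ c, and the conjunction of these rules describes the region; because each step removes only a fraction alpha of the remaining data rather than committing to a single binary split, the search is far less greedy. The complete procedure developed in the paper involves further steps omitted from this sketch.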
References
Barnett, V. (1976) The ordering of multivariate data (with discussion). J. Roy. Statist. Soc. A, 139, 318–354.
Bishop, C. M. (1995) Neural Networks for Pattern Recognition. Oxford University Press.
Breiman, L. (1996) Bagging predictors. Machine Learning, 24, 123–140.
Breiman, L., Friedman, J. H., Olshen, R. A. and Stone, C. J. (1984) Classification and Regression Trees. Wadsworth.
Clark, P. and Niblett, T. (1989) The CN2 induction algorithm. Machine Learning, 3, 261–283.
Cohen, W. W. (1995) Fast effective rule induction. In Machine Learning: Proceedings of the Twelfth International Conference, Lake Tahoe, CA (115–123). Morgan Kaufmann.
Donoho, D. and Gasko, M. (1992) Breakdown properties of location estimates based on halfspace depth and projected outlyingness. Annals of Statistics, 20, 1803–1827.
Efron, B. and Tibshirani, R. J. (1993) An Introduction to the Bootstrap. Chapman and Hall.
Friedman, J. H. (1997) On bias, variance, 0/1-loss, and the curse-of-dimensionality. Data Mining and Knowledge Discovery, 1, 55–77.
Green, P. J. (1981) Peeling bivariate data. In Interpreting Multivariate Data (V. Barnett, ed.) Wiley.
Griffin, W. L., Fisher, N. I., Friedman, J. H., Ryan, C. G., and O'Reilly, S. (1999) Cr-Pyrope garnets in lithospheric mantle. J. Petrology, to appear.
Hall, P. (1989) On projection pursuit regression. Annals of Statistics, 17, 573–588.
Lorentz, G. G. (1986) Approximation of Functions. Chelsea.
Mitchell, T. M. (1997) Machine Learning. McGraw-Hill.
Quinlan, J. R. (1990) Learning logical definitions from relations. Machine Learning, 5, 239–266.
Quinlan, J. R. (1994) C4.5: Programs for Machine Learning. Morgan Kaufmann.
Quinlan, J. R. (1995) MDL and categorical theories (continued). In Machine Learning: Proceedings of the Twelfth International Conference, Lake Tahoe, CA (464–470). Morgan Kaufmann.
Ripley, B. D. (1996) Pattern Recognition and Neural Networks. Cambridge University Press.
Rivest, R. L. (1987) Learning decision lists. Machine Learning, 2, 229–246.
Tibshirani, R. J. and Knight, K. (1995) Model search and inference by bootstrap “bumping”. Technical Report, University of Toronto.
Vapnik, V. (1995) The Nature of Statistical Learning Theory. Springer.
Wahba, G. (1990) Spline Models for Observational Data. SIAM.
Cite this article
Friedman, J.H., Fisher, N.I. Bump hunting in high-dimensional data. Statistics and Computing 9, 123–143 (1999). https://doi.org/10.1023/A:1008894516817