Abstract
In many situations, one wishes to group objects into well-defined classes on the basis of one set of descriptor variables, and then predict the classes of new objects from a different set of variables. For example, a bank may categorise customers into distinct financial behaviour pattern classes by observing how they have behaved over a period of years, and then seek to assign new customers to future behaviour classes using information captured when they open an account. Such situations require the striking of a compromise between the compactness and integrity of the cluster structure, and the accuracy of the predictive assignment to clusters. We describe two algorithms for achieving such a compromise, discuss some of their features, and illustrate their performance in a simulation study and in a liver transplant problem.
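The compromise described above can be illustrated with a minimal sketch. This is not either of the two algorithms the paper describes, which the abstract does not specify. It is a simple stand-in that assumes the trade-off can be expressed as a weighted sum of within-cluster scatter in the class-defining variables (here `Y`) and in the predictor variables (here `X`), minimised by a k-means-style iteration. The function names `weighted_partition`, `within_ss` and `predict_error`, and the weight `w`, are hypothetical and chosen only for illustration.

```python
import numpy as np

def weighted_partition(Y, X, k, w, n_iter=50, seed=0):
    """Lloyd-style minimisation of (1 - w) * SS_Y + w * SS_X.

    Y: variables that define the classes (n x p)
    X: variables available for predicting the class of new objects (n x q)
    w: hypothetical trade-off weight in [0, 1]; w = 0 is plain k-means on Y
    """
    rng = np.random.default_rng(seed)
    # Scale and stack so that squared distance in Z equals the weighted criterion
    Z = np.hstack([np.sqrt(1.0 - w) * Y, np.sqrt(w) * X])
    centres = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every object to its nearest centre in the stacked space
        d = ((Z[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Recompute centres; an empty cluster keeps its previous centre
        new_centres = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centres[j]
            for j in range(k)
        ])
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    return labels

def within_ss(V, labels):
    """Within-cluster sum of squares of V: smaller means more compact clusters."""
    return sum(
        ((V[labels == j] - V[labels == j].mean(axis=0)) ** 2).sum()
        for j in np.unique(labels)
    )

def predict_error(X, labels):
    """Resubstitution error when objects are assigned to clusters from X alone,
    using the nearest cluster centroid in X (an assumed, very simple rule)."""
    ids = np.unique(labels)
    cx = np.array([X[labels == j].mean(axis=0) for j in ids])
    d = ((X[:, None, :] - cx[None, :, :]) ** 2).sum(axis=2)
    return float(np.mean(ids[d.argmin(axis=1)] != labels))
```

Under these assumptions, running `weighted_partition` over a grid of `w` values and recording `within_ss(Y, labels)` alongside `predict_error(X, labels)` traces out the trade-off: small `w` favours compact clusters in the defining variables, while larger `w` favours clusters that are easier to predict from the second set of variables.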
Cite this article
Hand, D.J., Krzanowski, W.J. & Crowder, M.J. Optimal predictive partitioning. Stat Comput 17, 11–21 (2007). https://doi.org/10.1007/s11222-006-9003-x