Optimal predictive partitioning

Hand, David J.; Krzanowski, Wojtek J.; Crowder, Martin J.

doi:10.1007/s11222-006-9003-x

Published: 30 January 2007

Statistics and Computing, Volume 17, pages 11–21 (2007)

David J. Hand¹,³,
Wojtek J. Krzanowski² &
Martin J. Crowder¹


Abstract

In many situations, one wishes to group objects into well-defined classes on the basis of one set of descriptor variables, and then predict the classes of new objects from a different set of variables. For example, a bank may categorise customers into distinct financial behaviour pattern classes by observing how they have behaved over a period of years, and then seek to assign new customers to future behaviour classes using information captured when they open an account. Such situations require the striking of a compromise between the compactness and integrity of the cluster structure, and the accuracy of the predictive assignment to clusters. We describe two algorithms for achieving such a compromise, discuss some of their features, and illustrate their performance in a simulation study and in a liver transplant problem.
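
The compromise described in the abstract can be made concrete with a small numerical sketch. The code below is not one of the paper's two algorithms, which the abstract does not specify. It only assumes, for illustration, that compactness is measured by the within-cluster sum of squares on the descriptor variables `Y`, that predictive accuracy is measured by the resubstitution error of a nearest-centroid rule on the predictor variables `X`, and that the two are combined with a hypothetical trade-off parameter `weight`. All function and parameter names are illustrative.

```python
import numpy as np


def within_cluster_ss(Y, labels):
    """Sum of squared distances from each row of Y to its cluster mean."""
    total = 0.0
    for k in np.unique(labels):
        members = Y[labels == k]
        total += ((members - members.mean(axis=0)) ** 2).sum()
    return float(total)


def nearest_centroid_error(X, labels):
    """Resubstitution error of a nearest-centroid rule predicting labels from X."""
    classes = np.unique(labels)
    centroids = np.stack([X[labels == k].mean(axis=0) for k in classes])
    # Squared distance from every row of X to every class centroid.
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    predicted = classes[d2.argmin(axis=1)]
    return float((predicted != labels).mean())


def combined_criterion(Y, X, labels, weight=1.0):
    """Assumed criterion: mean within-cluster SS on Y plus weight * error on X.

    Smaller is better. A larger weight favours predictability over compactness.
    """
    compactness = within_cluster_ss(Y, labels) / len(Y)
    return compactness + weight * nearest_centroid_error(X, labels)


if __name__ == "__main__":
    # Toy data: Y holds observed behaviour, X holds information known at the outset.
    Y = np.array([[0.0], [0.1], [5.0], [5.1]])
    X = np.array([[0.0], [0.2], [4.0], [4.2]])
    compact_and_predictable = np.array([0, 0, 1, 1])
    scattered = np.array([0, 1, 0, 1])
    print(combined_criterion(Y, X, compact_and_predictable))
    print(combined_criterion(Y, X, scattered))
```

Under these assumptions, a partition that is both tight in `Y` and separable in `X` scores lower than one that is neither. Varying `weight` shifts the balance between the two aims, which is the kind of compromise the abstract refers to.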


Author information

Authors and Affiliations

1. Department of Mathematics, Imperial College of Science, Technology and Medicine, Huxley Building, 180 Queen’s Gate, London, SW7 2AZ, UK (David J. Hand, Martin J. Crowder)
2. School of Engineering, Computer Science and Mathematics, University of Exeter, North Park Road, Exeter, EX4 4QE, UK (Wojtek J. Krzanowski)
3. Institute for Mathematical Sciences, Imperial College of Science, Technology and Medicine, 53 Princes Gate, London, SW7 2AZ, UK (David J. Hand)

Corresponding author

Correspondence to Wojtek J. Krzanowski.

About this article

Cite this article

Hand, D.J., Krzanowski, W.J. & Crowder, M.J. Optimal predictive partitioning. Stat Comput 17, 11–21 (2007). https://doi.org/10.1007/s11222-006-9003-x

Issue Date: March 2007
