Abstract
We propose a new model for supervised classification in data mining applications. The model is based on products of trees: the information carried by each predictor variable is extracted separately by means of a recursive partition structure, and this information is then combined across predictors using a weighted product model form, an extension of the naive Bayes model. Empirical results comparing the new method with other methods from the machine learning literature are presented for several data sets. Two typical data mining applications, a chromosome identification problem and a forest cover type identification problem, are used to illustrate the ideas. The new approach is fast and surprisingly accurate.
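The structure described in the abstract can be caricatured in a few lines: each predictor gets its own recursive partition (reduced here to a depth-1 stump for brevity), leaf-level class frequencies estimate per-predictor class-conditional probabilities, and these are combined across predictors as a weighted product with the class prior. Unit weights recover the naive Bayes model. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the median split, and the Laplace smoothing are all illustrative choices.

```python
import math

def fit_stump(xs, ys, classes):
    # Depth-1 "tree": a single split at the median of the feature.
    # Leaf counts (with Laplace smoothing) give smoothed
    # class-conditional probabilities per leaf.
    thr = sorted(xs)[len(xs) // 2]
    counts = {leaf: {c: 1.0 for c in classes} for leaf in (0, 1)}
    for x, y in zip(xs, ys):
        counts[int(x >= thr)][y] += 1.0
    probs = {}
    for leaf, cc in counts.items():
        total = sum(cc.values())
        probs[leaf] = {c: cc[c] / total for c in classes}
    return thr, probs

def fit_product_of_stumps(X, y):
    # One recursive partition (here a stump) per predictor variable.
    classes = sorted(set(y))
    priors = {c: y.count(c) / len(y) for c in classes}
    stumps = [fit_stump([row[j] for row in X], y, classes)
              for j in range(len(X[0]))]
    return classes, priors, stumps

def predict(model, x, weights=None):
    # Combine predictors via a weighted product (sums of weighted
    # log-probabilities); weights of 1 give the naive Bayes model.
    classes, priors, stumps = model
    weights = weights or [1.0] * len(stumps)
    scores = {}
    for c in classes:
        s = math.log(priors[c])
        for (thr, probs), xj, w in zip(stumps, x, weights):
            s += w * math.log(probs[int(xj >= thr)][c])
        scores[c] = s
    return max(scores, key=scores.get)
```

In the paper the per-predictor partitions are full trees rather than stumps and the weights are estimated from the data; the sketch only shows the product-model form in which the pieces combine.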
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
Cite this paper
Ferreira, J.T.A., Denison, D.G., Hand, D.J. (2001). Data Mining with Products of Trees. In: Hoffmann, F., Hand, D.J., Adams, N., Fisher, D., Guimaraes, G. (eds) Advances in Intelligent Data Analysis. IDA 2001. Lecture Notes in Computer Science, vol 2189. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44816-0_17
Print ISBN: 978-3-540-42581-6
Online ISBN: 978-3-540-44816-7