
Boosting with Diverse Base Classifiers

  • Conference paper
Learning Theory and Kernel Machines

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2777)

Abstract

We establish a new bound on the generalization error rate of the Boost-by-Majority algorithm. The bound holds when the algorithm is applied to a collection of base classifiers that contains a “diverse” subset of “good” classifiers, in a precisely defined sense. We describe cross-validation experiments that suggest that Boost-by-Majority can be the basis of a practically useful learning method, often improving on the generalization of AdaBoost on large datasets.
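
For readers unfamiliar with Boost-by-Majority, the sketch below illustrates the weighting scheme at the heart of the algorithm as described by Freund (1995): at each round, an example's weight is the probability that it finishes exactly at the majority-vote boundary, given how many times it has been classified correctly so far and assuming each remaining base classifier is correct with probability 1/2 + gamma. This is a minimal sketch, not the experimental setup of the paper; the names bbm_weight, boost_by_majority, base_learner, T, and gamma are illustrative, and the round-indexing convention follows one common presentation of the algorithm.

    import numpy as np
    from math import comb

    def bbm_weight(r, t, T, gamma):
        # Weight of an example classified correctly r times after t completed
        # rounds, out of T total rounds: the probability that it ends with
        # exactly floor(T/2) correct votes when each of the remaining T - t
        # base classifiers is correct with probability 1/2 + gamma
        # (Freund, 1995; indexing convention illustrative).
        need = T // 2 - r              # further correct votes needed to reach the boundary
        remaining = T - t
        if need < 0 or need > remaining:
            return 0.0
        return comb(remaining, need) * (0.5 + gamma) ** need * (0.5 - gamma) ** (remaining - need)

    def boost_by_majority(X, y, base_learner, T=101, gamma=0.1):
        # base_learner(X, y, sample_weight) is an assumed interface: it returns
        # a classifier with predictions in {-1, +1} whose weighted error is at
        # most 1/2 - gamma on the given distribution.
        n = len(y)
        correct = np.zeros(n, dtype=int)       # rounds each example has been classified correctly
        hypotheses = []
        for t in range(T):
            w = np.array([bbm_weight(r, t, T, gamma) for r in correct])
            if w.sum() == 0:                   # every example's final vote is already decided
                break
            h = base_learner(X, y, w / w.sum())
            correct += (h.predict(X) == y).astype(int)
            hypotheses.append(h)
        # The combined classifier is an unweighted majority vote of the base classifiers.
        return lambda Xnew: np.sign(sum(h.predict(Xnew) for h in hypotheses))

With scikit-learn, for instance, base_learner could be lambda X, y, w: DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w), which uses decision stumps of the kind commonly employed as base classifiers in boosting experiments.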






Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Dasgupta, S., Long, P.M. (2003). Boosting with Diverse Base Classifiers. In: Schölkopf, B., Warmuth, M.K. (eds) Learning Theory and Kernel Machines. Lecture Notes in Computer Science (LNAI), vol 2777. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45167-9_21

  • DOI: https://doi.org/10.1007/978-3-540-45167-9_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40720-1

  • Online ISBN: 978-3-540-45167-9

  • eBook Packages: Springer Book Archive
