Abstract
InfoBoost is a boosting algorithm that improves the performance of the master hypothesis whenever each weak hypothesis brings non-zero mutual information about the target. We make the somewhat surprising observation that InfoBoost can be viewed as an algorithm for growing a branching program that divides and merges the domain repeatedly. We generalize the merging process and propose a new class of boosting algorithms, called BP.InfoBoost, with various merging schemes. BP.InfoBoost assigns to each node a weight as well as a weak hypothesis, and the master hypothesis is a threshold function of the sum of the weights over the path induced by a given instance. InfoBoost is a BP.InfoBoost with one extreme scheme that merges all nodes in each round. The other extreme, which merges no nodes, yields an algorithm for growing a decision tree; we call this particular version DT.InfoBoost. We give evidence that DT.InfoBoost improves the master hypothesis very efficiently, but it runs a risk of overfitting because the size of the master hypothesis may grow exponentially. We propose a merging scheme between these extremes that improves the master hypothesis nearly as fast as the scheme without merging while keeping the branching program at a moderate size.
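The abstract describes the master hypothesis as a threshold function of the sum of node weights along the path that an instance induces through the branching program. The following is a minimal sketch of that evaluation step only (not the authors' boosting algorithm); the class and function names, the branch-index convention, and the toy weak hypothesis are all illustrative assumptions.

```python
# Hedged sketch: evaluating a branching-program master hypothesis as
# described in the abstract. Each node carries a weight and (at internal
# nodes) a weak hypothesis mapping an instance to a branch index; the
# master prediction thresholds the sum of weights along the induced path.
# All names here are illustrative, not from the paper.

class Node:
    def __init__(self, weight, weak_hyp=None, children=None):
        self.weight = weight            # weight assigned to this node
        self.weak_hyp = weak_hyp        # instance -> branch index, None at a leaf
        self.children = children or {}  # branch index -> child Node

def master_predict(root, x, threshold=0.0):
    """Sum node weights along the path induced by x, then threshold."""
    total, node = 0.0, root
    while node is not None:
        total += node.weight
        if node.weak_hyp is None or not node.children:
            break  # reached a leaf
        node = node.children.get(node.weak_hyp(x))
    return 1 if total >= threshold else -1

# Toy example: a depth-2 program whose root tests the sign of feature 0.
h = lambda x: int(x[0] > 0)
root = Node(weight=0.0, weak_hyp=h,
            children={1: Node(weight=+1.0), 0: Node(weight=-1.0)})
# master_predict(root, [2.0])  -> 1
# master_predict(root, [-2.0]) -> -1
```

Merging schemes differ only in which nodes of a level are identified with each other during growth; the evaluation above is unchanged, which is why InfoBoost (merge all) and DT.InfoBoost (merge none) share the same form of master hypothesis.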
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
Cite this paper
Takimoto, E., Koya, S., Maruoka, A. (2004). Boosting Based on Divide and Merge. In: Ben-David, S., Case, J., Maruoka, A. (eds) Algorithmic Learning Theory. ALT 2004. Lecture Notes in Computer Science(), vol 3244. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30215-5_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23356-5
Online ISBN: 978-3-540-30215-5