Abstract
We study the problem of learning ensembles in the online setting, where the hypotheses are selected from a base family that may be a union of possibly very complex sub-families. We prove new theoretical guarantees for the online learning of such ensembles, expressed in terms of the sequential Rademacher complexities of these sub-families, and describe an algorithm that benefits from these guarantees. We further extend our framework to the batch setting by proving new structural estimation error guarantees for ensembles via a new data-dependent online-to-batch conversion technique, thereby also obtaining an effective batch algorithm that does not require estimating the Rademacher complexities of the base sub-families.
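The paper's algorithm and its data-dependent online-to-batch conversion are not reproduced on this page, so the following is only a minimal sketch of the general pattern the abstract describes: an online ensemble maintained by exponentially weighted averaging over a pool of base hypotheses, followed by the classical averaged-weights online-to-batch step. The finite pool, the learning rate, the squared loss, and the plain weight averaging are all assumptions made for illustration; they are not the paper's method.

```python
# Illustrative sketch only (assumed setup, not the paper's algorithm):
# exponentially weighted averaging over a finite pool of base hypotheses,
# with a running average of the weight vectors as a simple
# online-to-batch conversion.
import numpy as np

def ewa_ensemble(hypotheses, stream, eta=0.5):
    """Online ensemble learning by exponentially weighted averaging.

    hypotheses: list of callables h(x) -> prediction in [0, 1]
    stream:     iterable of (x, y) pairs with y in [0, 1]
    eta:        learning rate (assumed; proper tuning depends on loss/horizon)
    """
    n = len(hypotheses)
    log_w = np.zeros(n)      # log-weights, for numerical stability
    avg_w = np.zeros(n)      # running average of weight vectors (batch output)
    total_loss = 0.0         # cumulative loss of the online ensemble
    t = 0
    for x, y in stream:
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        preds = np.array([h(x) for h in hypotheses])
        yhat = float(w @ preds)           # ensemble prediction at round t
        total_loss += (yhat - y) ** 2
        log_w -= eta * (preds - y) ** 2   # multiplicative weight update
        t += 1
        avg_w += (w - avg_w) / t          # incremental average of the weights
    return avg_w, total_loss

# Toy usage: a pool drawn from two simple "sub-families"
# (constant predictors and a threshold rule).
hs = [lambda x: 0.0, lambda x: 1.0, lambda x: float(x > 0.5)]
data = [(0.2, 0.0), (0.8, 1.0), (0.3, 0.0), (0.9, 1.0)]
weights, loss = ewa_ensemble(hs, data)
# Batch ensemble: x -> sum_i weights[i] * hs[i](x)
```

In the sketch, the averaged weight vector plays the role of a crude online-to-batch conversion; the paper's contribution is a data-dependent conversion with structural guarantees that scale with the sequential Rademacher complexities of the individual sub-families rather than the whole base family.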
Acknowledgements
This work was partly funded by the NSF awards IIS-1117591 and CCF-1535987 and was also supported by the National Science Foundation Graduate Research Fellowship under Grant No. DGE 1342536.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Mohri, M., Yang, S. (2016). Structural Online Learning. In: Ortner, R., Simon, H., Zilles, S. (eds) Algorithmic Learning Theory. ALT 2016. Lecture Notes in Computer Science, vol. 9925. Springer, Cham. https://doi.org/10.1007/978-3-319-46379-7_15
DOI: https://doi.org/10.1007/978-3-319-46379-7_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46378-0
Online ISBN: 978-3-319-46379-7
eBook Packages: Computer Science, Computer Science (R0)