Abstract
In classification, class noise refers to the incorrect labelling of instances, which degrades classifier performance. In this contribution, we test the noise resistance of the most influential boosting algorithms. We explain the fundamentals of these state-of-the-art algorithms and provide a unified notation to facilitate their comparison. We analyse how they carry out the classification, which loss functions they use and which techniques they employ under the boosting scheme.
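As an illustration of what class noise means in practice, the following minimal Python sketch corrupts a chosen fraction of labels by replacing each selected label with a different class picked uniformly at random. The function name `inject_class_noise`, the parameter `noise_rate` and the uniform noise scheme are assumptions made for this example; the abstract does not state how noise is introduced in the study.

```python
import random


def inject_class_noise(labels, noise_rate, seed=0):
    """Return a copy of `labels` with a fraction `noise_rate` of entries
    replaced by a different class chosen uniformly at random.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    if not 0.0 <= noise_rate <= 1.0:
        raise ValueError("noise_rate must be in [0, 1]")
    rng = random.Random(seed)          # local generator for reproducibility
    classes = sorted(set(labels))
    noisy = list(labels)
    if len(classes) < 2:
        return noisy                   # no alternative class to flip to
    n_flip = round(noise_rate * len(noisy))
    # Pick distinct positions, then give each one a wrong label.
    for i in rng.sample(range(len(noisy)), n_flip):
        alternatives = [c for c in classes if c != noisy[i]]
        noisy[i] = rng.choice(alternatives)
    return noisy


clean = [0] * 50 + [1] * 50
noisy = inject_class_noise(clean, noise_rate=0.2)
changed = sum(a != b for a, b in zip(clean, noisy))
print(changed)  # prints 20
```

A noise-resistance experiment of the kind described would train each boosting algorithm on labels corrupted this way at several noise rates and compare its accuracy on a clean test set.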
Acknowledgments
This work was supported by the National Research Project TIN2014-57251-P and Andalusian Research Plan P11-TIC-7765.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Gómez-Ríos, A., Luengo, J., Herrera, F. (2017). A Study on the Noise Label Influence in Boosting Algorithms: AdaBoost, GBM and XGBoost. In: Martínez de Pisón, F., Urraca, R., Quintián, H., Corchado, E. (eds) Hybrid Artificial Intelligent Systems. HAIS 2017. Lecture Notes in Computer Science, vol 10334. Springer, Cham. https://doi.org/10.1007/978-3-319-59650-1_23
DOI: https://doi.org/10.1007/978-3-319-59650-1_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-59649-5
Online ISBN: 978-3-319-59650-1
eBook Packages: Computer Science, Computer Science (R0)