LocalBoost: A Parallelizable Approach to Boosting Classifiers

Abstract

Ensemble learning is an active field of research with applications to a broad range of problems. Adaboost is a widely used ensemble approach; however, its computational burden is high because it uses an explicit diversity mechanism to build the individual learners. To address this issue, we present a variant of Adaboost in which the learners can be trained in parallel, exchanging information over a sparse collaborative communication topology that restricts visibility among them. Experiments on 12 UCI datasets show that this approach is competitive with Adaboost in terms of generalization error while being more efficient than Adaboost and two other parallel approximations of this algorithm.
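
The sketch below illustrates, under our own simplifying assumptions, the kind of scheme the abstract describes: each learner keeps its own example-weight vector and exchanges misclassification information only with a few neighbours on a sparse (here, ring) topology. The base learners (decision stumps), the update rule, the label encoding in {-1, +1}, and the name local_boost_sketch are all illustrative choices, not the paper's exact LocalBoost algorithm.

```python
# Minimal, hypothetical sketch of boosting with a sparse communication
# topology; NOT the paper's LocalBoost, just an illustration of the idea.
import numpy as np
from sklearn.tree import DecisionTreeClassifier


def local_boost_sketch(X, y, n_learners=8, n_rounds=10, n_neighbors=1, seed=0):
    """Ring-topology boosting sketch. Assumes binary labels in {-1, +1}."""
    rng = np.random.default_rng(seed)
    n = len(y)
    weights = [np.full(n, 1.0 / n) for _ in range(n_learners)]  # one weight vector per node
    models = [[] for _ in range(n_learners)]
    alphas = [[] for _ in range(n_learners)]

    for _ in range(n_rounds):
        miss_masks = []
        for k in range(n_learners):                      # runs in parallel in a real deployment
            stump = DecisionTreeClassifier(max_depth=1,
                                           random_state=int(rng.integers(1 << 31)))
            stump.fit(X, y, sample_weight=weights[k])
            miss = stump.predict(X) != y
            err = float(np.clip(weights[k][miss].sum(), 1e-10, 1 - 1e-10))
            models[k].append(stump)
            alphas[k].append(0.5 * np.log((1.0 - err) / err))
            miss_masks.append(miss)
        # Sparse exchange: node k only sees its own and its ring neighbours' mistakes.
        for k in range(n_learners):
            visible = [(k + d) % n_learners for d in range(-n_neighbors, n_neighbors + 1)]
            shared_miss = np.mean([miss_masks[j] for j in visible], axis=0)
            weights[k] = weights[k] * np.exp(alphas[k][-1] * shared_miss)
            weights[k] /= weights[k].sum()

    def predict(X_new):
        # Weighted vote over every learner trained at every node.
        score = sum(a * m.predict(X_new)
                    for node_m, node_a in zip(models, alphas)
                    for m, a in zip(node_m, node_a))
        return np.sign(score)

    return predict
```

In an actual parallel run, the inner loop over nodes would execute on separate processors, and only the small misclassification masks, not the training data, would travel between neighbours.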

Notes

  1. Under a client/server (master/slave) model, this cost can become linear if all nodes send their classifications to one coordinator, which then computes the weight updates and sends those weights back to every node. However, this approach exhibits more limited scalability because of the synchronization operations and the communication bottleneck around the coordinator [20]; the pattern is sketched below.
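
For contrast, here is a minimal sketch of that client/server pattern, written in plain NumPy with hypothetical names and no particular message-passing library assumed: each worker ships its per-example misclassification mask to the coordinator, which applies one multiplicative AdaBoost-style update and broadcasts the same weight vector back to everyone.

```python
import numpy as np


def coordinator_round(miss_masks, alphas, weights):
    """One round at the coordinator: aggregate every worker's boolean
    misclassification mask, apply a multiplicative AdaBoost-style update,
    and return the normalised weight vector broadcast back to all workers."""
    update = np.zeros_like(weights)
    for miss, alpha in zip(miss_masks, alphas):
        update += alpha * miss               # accumulate each worker's weighted mistakes
    new_weights = weights * np.exp(update)   # boost examples that many workers got wrong
    return new_weights / new_weights.sum()   # renormalise before broadcasting
```

The bottleneck mentioned above is visible in the pattern itself: all masks converge on one node every round, and each worker must wait for the broadcast before continuing.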

References

  1. Bauer E, Kohavi R (1999) An empirical comparison of voting classification algorithms: bagging, boosting, and variants. Mach Learn 36:105–139

  2. Bi Y (2012) The impact of diversity on the accuracy of evidential classifier ensembles. Int J Approx Reason 53(4):584–607

  3. Bradley JK, Schapire RE (2007) Filterboost: regression and classification on large datasets. In: Platt JC, Koller D, Singer Y, Roweis ST (eds) NIPS. Curran Associates, Inc., Red Hook, pp 185–192

  4. Breiman L (1996) Bagging predictors. Mach Learn 24(2):123–140

  5. Breiman L (2001) Using iterated bagging to debias regressions. Mach Learn 45(3):261–277

  6. Brown G, Wyatt JL, Harris R, Yao X (2005) Diversity creation methods: a survey and categorisation. Inf Fusion 6(1):5–20

  7. Brown G, Wyatt JL, Tiňo P (2005) Managing diversity in regression ensembles. J Mach Learn Res 6:1621–1650

  8. Bühlmann P (2003) Bagging, subagging and bragging for improving some prediction algorithms. In: Akritas MG, Politis DN (eds) Recent advances and trends in nonparametric statistics. Elsevier, New York, pp 19–34

  9. Bukhtoyarov V, Semenkin E (2012) Neural networks ensemble approach for detecting attacks in computer networks. In: 2012 IEEE Congress on evolutionary computation (CEC), pp 1–6

  10. Chen T, Guestrin C (2016) Xgboost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 785–794

  11. Chen T, He T (2015) Higgs boson discovery with boosted trees. In: NIPS 2014 workshop on high-energy physics and machine learning, pp 69–80

  12. Deveci M, Rajamanickam S, Leung VJ, Pedretti K, Olivier SL, Bunde DP, Catalyurek UV, Devine K (2014) Exploiting geometric partitioning in task mapping for parallel computers. In: 2014 IEEE 28th international parallel and distributed processing symposium. IEEE, pp 27–36

  13. Escudero G, Màrquez L, Rigau G (2001) Using lazyboosting for word sense disambiguation. In: The proceedings of the second international workshop on evaluating word sense disambiguation systems, SENSEVAL ’01. Association for Computational Linguistics, Stroudsburg, PA, USA, pp 71–74

  14. Freund Y, Schapire RE (1997) A decision-theoretic generalization of on-line learning and an application to boosting. J Comput Syst Sci 55(1):119–139

  15. Galtier V, Genaud S, Vialle S (2009) Implementation of the Adaboost algorithm for large scale distributed environments: comparing JavaSpace and MPJ. In: 2009 international conference on parallel and distributed systems, pp 655–662

  16. Grandvalet Y (2004) Bagging equalizes influence. Mach Learn 55(3):251–270

  17. Hoefler T, Snir M (2011) Generic topology mapping strategies for large-scale parallel architectures. In: Proceedings of the international conference on supercomputing. ACM, pp 75–84

  18. Kuncheva LI (2004) Combining pattern classifiers: methods and algorithms. Wiley, New York

  19. Dua D, Karra Taniskidou E (2017) UCI machine learning repository. School of Information and Computer Science, University of California, Irvine, CA. http://archive.ics.uci.edu/ml

  20. Lua EK, Crowcroft J, Pias M, Sharma R, Lim S (2005) A survey and comparison of peer-to-peer overlay network schemes. IEEE Commun Surv Tutor 7(2):72–93

  21. Merler S, Caprile B, Furlanello C (2007) Parallelizing Adaboost by weights dynamics. Comput Stat Data Anal 51(5):2487–2498

  22. Mukherjee I, Rudin C, Schapire RE (2013) The rate of convergence of Adaboost. J Mach Learn Res 14:2315–2347

  23. Ñanculef R, Valle C, Allende H, Moraga C (2012) Training regression ensembles by sequential target correction and resampling. Inf Sci 195:154–174. https://doi.org/10.1016/j.ins.2012.01.035

  24. Opitz D, Maclin R (1999) Popular ensemble methods: an empirical study. J Artif Intell Res 11:169–198

  25. Palit I, Reddy CK (2012) Scalable and parallel boosting with MapReduce. IEEE Trans Knowl Data Eng 24(10):1904–1916

  26. Poggio T, Rifkin R, Mukherjee S, Rakhlin A (2002) Bagging regularizes. Technical report, AI Memo 2002-003, CBCL Memo 214, MIT AI Lab

  27. Polikar R (2009) Ensemble learning. Scholarpedia 4(1):2776

  28. Ren Y, Zhang L, Suganthan PN (2016) Ensemble classification and regression—recent developments, applications and future directions. IEEE Comput Intell Mag 11(1):41–53

  29. Rudin C, Schapire R, Daubechies I (2007) Precise statements of convergence for Adaboost and ARC-GV. Contemp Math 443:131–146

  30. Sluban B, Lavrač N (2015) Relating ensemble diversity and performance: a study in class noise detection. Neurocomputing 160:120–131

  31. Tang EK, Suganthan PN, Yao X (2006) An analysis of diversity measures. Mach Learn 65(1):247–271

  32. Valle C, Ñanculef R, Allende H, Moraga C (2007) Two bagging algorithms with coupled learners to encourage diversity. In: IDA, Lecture Notes in Computer Science, vol 4723. Springer, pp 130–139

  33. Valle C, Saravia F, Allende H, Monge R, Fernández C (2010) Parallel approach for ensemble learning with locally coupled neural networks. Neural Process Lett 32(3):277–291

  34. Wu G, Li H, Hu X, Bi Y, Zhang J, Wu X (2009) Mrec4.5: C4.5 ensemble classification with MapReduce. In: 2009 fourth China grid annual conference, pp 249–255

  35. Wu Y, Arribas J (2003) Fusing output information in neural networks: ensemble performs better. In: Proceedings of the 25th annual international conference of the IEEE Engineering in Medicine and Biology Society, vol 3, pp 2265–2268

  36. Zeng K, Tang Y, Liu F (2011) Parallization of Adaboost algorithm through hybrid MPI/OpenMP and transactional memory. In: Cotronis Y, Danelutto M, Papadopoulos GA (eds) Proceedings of the 19th international Euromicro conference on parallel, distributed and network-based processing, PDP 2011, Ayia Napa, Cyprus, 9–11 Feb 2011. IEEE Computer Society, pp 94–100

  37. Zhang L, Suganthan PN (2014) Random forests with ensemble of feature spaces. Pattern Recognit 47(10):3429–3437

  38. Zhang L, Suganthan PN (2015) Oblique decision tree ensemble via multisurface proximal support vector machine. IEEE Trans Cybern 45(10):2165–2176

Acknowledgements

This work was supported by Research Project DGIP-UTFSM (Chile) 116.24.2 and Basal Project FB 0821, and by CONICYT Chile through FONDECYT Project 11130122 and FONDECYT Project 1170123.

Author information

Corresponding author

Correspondence to Carlos Valle.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Valle, C., Ñanculef, R., Allende, H. et al. LocalBoost: A Parallelizable Approach to Boosting Classifiers. Neural Process Lett 50, 19–41 (2019). https://doi.org/10.1007/s11063-018-9924-3
