Abstract
Ensemble learning has received considerable attention in tasks such as regression, classification, and clustering. Adaboost and Bagging are two popular approaches for training ensemble models. The former yields accurate estimates in regression settings but is computationally expensive because of its inherently sequential structure, while the latter is less accurate but highly efficient. A major drawback of ensemble algorithms is the high computational cost of the training stage. To address this issue, we propose a parallel implementation of the Resampling Local Negative Correlation (RLNC) algorithm for training a neural network ensemble, aiming for accuracy competitive with Adaboost and efficiency comparable to that of Bagging. We evaluate our approach on both synthetic and real regression datasets from the UCI and Statlib repositories. In particular, our fine-grained parallel approach achieves a satisfactory balance between accuracy and parallel efficiency.
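To make the idea behind negative-correlation ensemble training with resampling concrete, the sketch below trains a small ensemble of linear regressors, each on its own bootstrap resample, using the standard negative-correlation gradient in which each learner's error term is offset by a penalty pulling it away from the ensemble mean. This is a minimal illustrative sketch under those assumptions, not the paper's RLNC algorithm or its parallel implementation; the function names, the linear base learners, and the hyperparameters (`lam`, `lr`, `epochs`) are choices made here for illustration.

```python
import numpy as np

def train_ncl_ensemble(X, y, n_models=5, lam=0.5, lr=0.05, epochs=300, seed=0):
    """Illustrative sketch: bootstrap-resampled ensemble of linear models
    trained with a negative-correlation penalty (not the paper's RLNC)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.normal(scale=0.1, size=(n_models, d))  # one weight row per learner
    b = np.zeros(n_models)
    # One bootstrap sample per learner: the "resampling" component.
    boots = [rng.integers(0, n, size=n) for _ in range(n_models)]
    for _ in range(epochs):
        preds = X @ W.T + b          # (n, n_models) predictions on full data
        fbar = preds.mean(axis=1)    # ensemble mean prediction
        for i in range(n_models):
            idx = boots[i]
            f_i = X[idx] @ W[i] + b[i]
            # Negative-correlation gradient: squared-error term minus a
            # penalty that discourages agreement with the ensemble mean.
            g = (f_i - y[idx]) - lam * (f_i - fbar[idx])
            W[i] -= lr * (g @ X[idx]) / len(idx)
            b[i] -= lr * g.mean()
    return W, b

def ensemble_predict(W, b, X):
    """Average the individual learners' predictions."""
    return (X @ W.T + b).mean(axis=1)
```

Because each learner updates against the current ensemble mean, the inner loop over learners is the natural grain for parallelization: learners can be updated concurrently once `fbar` is shared, which is the kind of fine-grained decomposition the abstract refers to.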
Cite this article
Valle, C., Saravia, F., Allende, H. et al. Parallel Approach for Ensemble Learning with Locally Coupled Neural Networks. Neural Process Lett 32, 277–291 (2010). https://doi.org/10.1007/s11063-010-9157-6