Abstract
Experiments were conducted to compare the performance of bagging ensembles evaluated on three different test sets composed of base, out-of-bag, and 30% holdout instances. Six weak learners implemented in the WEKA data mining system were applied: conjunctive rules, decision stump, decision table, pruned model trees, rule model trees, and multilayer perceptron. All algorithms were applied to real-world datasets derived from the cadastral system and the registry of real estate transactions and cleansed by property valuation experts. The results were analysed using a recently proposed statistical methodology comprising nonparametric tests followed by post-hoc procedures designed specifically for multiple n×n comparisons. The base test set yielded the lowest prediction error only for the model trees and the neural network.
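The three-way evaluation described above can be illustrated with a minimal sketch. It is not the authors' WEKA setup: it assumes a single numeric feature, a hand-written regression stump as the weak learner, synthetic data, and one possible reading of the three test sets (the holdout is split off first, bags are drawn from the remaining "base" pool, and out-of-bag predictions use only the models that did not see a given instance). The names `fit_stump` and `bagging_errors` are hypothetical.

```python
import random
from statistics import mean

def fit_stump(rows):
    """Fit a one-split regression stump on (x, y) pairs by minimising squared error."""
    rows = sorted(rows)
    best = None
    for i in range(1, len(rows)):
        if rows[i][0] == rows[i - 1][0]:
            continue  # no valid threshold between equal x values
        left = [y for _, y in rows[:i]]
        right = [y for _, y in rows[i:]]
        ml, mr = mean(left), mean(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, (rows[i - 1][0] + rows[i][0]) / 2, ml, mr)
    if best is None:  # all x values identical: predict the overall mean
        m = mean(y for _, y in rows)
        return lambda x: m
    _, thr, ml, mr = best
    return lambda x: ml if x <= thr else mr

def bagging_errors(data, n_bags=50, holdout_share=0.3, seed=0):
    """Return the mean squared error of a bagged ensemble on three test sets.

    Assumed protocol (not taken from the paper): split off the holdout,
    draw bootstrap bags from the remaining base pool, then score on
    base, out-of-bag, and holdout instances.
    """
    rng = random.Random(seed)
    data = list(data)
    rng.shuffle(data)
    cut = int(round((1 - holdout_share) * len(data)))
    base, holdout = data[:cut], data[cut:]

    models, bags = [], []
    for _ in range(n_bags):
        idx = [rng.randrange(len(base)) for _ in base]  # sample with replacement
        models.append(fit_stump([base[i] for i in idx]))
        bags.append(set(idx))

    def predict(x, members):
        return mean(m(x) for m in members)  # average the members' outputs

    base_err = mean((predict(x, models) - y) ** 2 for x, y in base)
    hold_err = mean((predict(x, models) - y) ** 2 for x, y in holdout)

    oob_sq = []
    for i, (x, y) in enumerate(base):
        unseen = [m for m, b in zip(models, bags) if i not in b]
        if unseen:  # skip instances that happened to appear in every bag
            oob_sq.append((predict(x, unseen) - y) ** 2)
    return {"base": base_err, "oob": mean(oob_sq), "holdout": hold_err}

# Synthetic data for illustration only: a noisy linear trend.
_rng = random.Random(1)
data = [(x / 10, 2 * (x / 10) + _rng.gauss(0, 1)) for x in range(100)]
errors = bagging_errors(data)
print(errors)
```

Because base instances are also used for training, the base error in such a setup is an optimistic estimate, while the out-of-bag and holdout errors are computed on instances the contributing models have not seen.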
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bańczyk, K., Kempa, O., Lasota, T., Trawiński, B. (2011). Empirical Comparison of Bagging Ensembles Created Using Weak Learners for a Regression Problem. In: Nguyen, N.T., Kim, CG., Janiak, A. (eds) Intelligent Information and Database Systems. ACIIDS 2011. Lecture Notes in Computer Science, vol 6592. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20042-7_32
DOI: https://doi.org/10.1007/978-3-642-20042-7_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20041-0
Online ISBN: 978-3-642-20042-7
eBook Packages: Computer Science (R0)