Abstract
The intuition behind ensembles is that different prediction models compensate for each other's errors if they are combined in an appropriate way. Large ensembles offer many different prediction models; however, many of them may share similar error characteristics, which substantially weakens the compensation effect. Selecting an appropriate subset of models is therefore crucial. In this paper, we address this problem. As our major contribution, for the case where a large number of models is present, we propose a graph-based framework for model selection that pays special attention to the interaction effects among models. Within this framework, we introduce four ensemble techniques and compare them to the state of the art in experiments on publicly available real-world data.
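To make the core intuition concrete, the following is a minimal sketch (not the authors' specific algorithm) of graph-based model selection: models are nodes, edge weights encode how often two models err on the same cases, and a subset is chosen greedily so that the selected models' errors overlap as little as possible. All function names and the greedy criterion here are illustrative assumptions.

```python
import numpy as np

def error_correlation_graph(error_matrix):
    """Complete weighted graph over models: edge weight = fraction of
    cases on which two models make an error simultaneously (shared-error rate).
    error_matrix[i, t] is True iff model i errs on test case t."""
    n_models = error_matrix.shape[0]
    weights = np.zeros((n_models, n_models))
    for i in range(n_models):
        for j in range(i + 1, n_models):
            shared = np.mean(error_matrix[i] & error_matrix[j])
            weights[i, j] = weights[j, i] = shared
    return weights

def greedy_diverse_subset(error_matrix, k):
    """Greedily select k models: start with the most accurate one, then
    repeatedly add the model whose errors overlap least (on average)
    with the models already selected -- an illustrative selection rule."""
    weights = error_correlation_graph(error_matrix)
    accuracies = 1.0 - error_matrix.mean(axis=1)
    selected = [int(np.argmax(accuracies))]
    while len(selected) < k:
        candidates = [m for m in range(error_matrix.shape[0]) if m not in selected]
        best = min(candidates, key=lambda m: weights[m, selected].mean())
        selected.append(best)
    return selected

# Usage: models 0 and 1 have identical errors, model 2 errs elsewhere,
# so a diverse pair combines model 2 with one of the redundant models.
errors = np.array([[1, 0, 0, 1],
                   [1, 0, 0, 1],
                   [0, 1, 0, 0]], dtype=bool)
print(greedy_diverse_subset(errors, 2))
```

The point of the graph view is that redundancy is a pairwise (interaction) property: a model's usefulness in the ensemble depends on which other models are already selected, which a per-model ranking cannot capture.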
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Buza, K., Nanopoulos, A., Schmidt-Thieme, L. (2010). Graph-Based Model-Selection Framework for Large Ensembles. In: Graña Romay, M., Corchado, E., Garcia Sebastian, M.T. (eds) Hybrid Artificial Intelligence Systems. HAIS 2010. Lecture Notes in Computer Science, vol 6076. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13769-3_68
DOI: https://doi.org/10.1007/978-3-642-13769-3_68
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13768-6
Online ISBN: 978-3-642-13769-3
eBook Packages: Computer Science (R0)