Abstract
Stability selection [9] is a general principle for feature selection. It functions as a meta-layer on top of a “baseline” feature selection method: the baseline is applied repeatedly to random data subsamples of half size, and the features whose selection frequency exceeds a fixed threshold are returned. In the present work, we propose and study a simple extension of the original stability selection: the baseline method is applied to random submatrices of the data matrix X of a given size, and the features with the largest selection frequencies are returned. We analyze theoretically the effect of this subsampling on the selected variables, in particular the influence of the data subsample size. We report experimental results on high-dimensional artificial and real data and identify the settings in which stability selection is to be recommended.
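For concreteness, the following is a minimal Python sketch of the procedure described above. It is not the authors' implementation: the function names (`stability_selection`, `lasso_baseline`), the parameter defaults, and the choice of the Lasso as baseline selector are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso


def stability_selection(X, y, baseline_select, n_rounds=100,
                        n_rows=None, n_cols=None, seed=None):
    """Apply `baseline_select` to random submatrices of X and return,
    for each feature, the fraction of rounds in which it was selected
    (among the rounds in which it appeared in the submatrix)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    n_rows = n_rows if n_rows is not None else n // 2  # original scheme: half-size row subsamples
    n_cols = n_cols if n_cols is not None else p       # extension: also subsample columns
    counts = np.zeros(p)       # times each feature was selected
    appearances = np.zeros(p)  # times each feature appeared in a submatrix
    for _ in range(n_rounds):
        rows = rng.choice(n, size=n_rows, replace=False)
        cols = rng.choice(p, size=n_cols, replace=False)
        appearances[cols] += 1
        # baseline_select returns indices local to the column subsample
        selected = baseline_select(X[np.ix_(rows, cols)], y[rows])
        counts[cols[selected]] += 1
    return np.divide(counts, appearances,
                     out=np.zeros(p), where=appearances > 0)


def lasso_baseline(X_sub, y_sub, alpha=0.1):
    """One possible baseline: features with nonzero Lasso coefficients."""
    coef = Lasso(alpha=alpha, max_iter=5000).fit(X_sub, y_sub).coef_
    return np.flatnonzero(coef)
```

Ranking features by the returned frequencies and keeping those above a fixed threshold recovers the original thresholding step; when n_cols < p, normalizing by the number of appearances keeps frequencies comparable across features, since a feature absent from a submatrix cannot be selected in that round.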
References
MASH project, http://www.mash-project.eu
Bi, J., Bennett, K., Embrechts, M., Breneman, C., Song, M.: Dimensionality reduction via sparse support vector machines. JMLR 3, 1229–1243 (2003)
Breiman, L.: Random forests. Machine Learning 45, 5–32 (2001)
Bühlmann, P., Yu, B.: Analyzing Bagging. The Annals of Statistics 30(4), 927–961 (2002)
Escudero, G., Màrquez, L., Rigau, G.: Boosting Applied to Word Sense Disambiguation. In: Lopez de Mantaras, R., Plaza, E. (eds.) ECML 2000. LNCS (LNAI), vol. 1810, pp. 129–141. Springer, Heidelberg (2000)
Fleuret, F.: Fast binary feature selection with conditional mutual information. JMLR 5, 1531–1555 (2004)
Kirk, P., Lewin, A., Stumpf, M.: Discussion of “Stability Selection” by Meinshausen and Bühlmann. J. Roy. Statist. Soc., Ser. B 72(4), 456–458 (2010)
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998)
Meinshausen, N., Bühlmann, P.: Stability selection. J. Roy. Statist. Soc., Ser. B 72(4), 417–448 (2010)
Schapire, R., Singer, Y.: Improved boosting algorithms using confidence-rated predictions. Machine Learning 37(3), 297–336 (1999)
Shah, R., Samworth, R.: Variable selection with error control: Another look at stability selection. J. Roy. Statist. Soc., Ser. B (to appear, 2012)
Wang, S., Nan, B., Rosset, S., Zhu, J.: Random Lasso. The Annals of Applied Statistics 5(1), 468–485 (2011)
Zaman, F., Hirose, H.: Effect of Subsampling Rate on Subbagging and Related Ensembles of Stable Classifiers. In: Chaudhury, S., Mitra, S., Murthy, C.A., Sastry, P.S., Pal, S.K. (eds.) PReMI 2009. LNCS, vol. 5909, pp. 44–49. Springer, Heidelberg (2009)
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Beinrucker, A., Dogan, Ü., Blanchard, G. (2012). A Simple Extension of Stability Feature Selection. In: Pinz, A., Pock, T., Bischof, H., Leberl, F. (eds) Pattern Recognition. DAGM/OAGM 2012. Lecture Notes in Computer Science, vol 7476. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32717-9_26
DOI: https://doi.org/10.1007/978-3-642-32717-9_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32716-2
Online ISBN: 978-3-642-32717-9