Abstract
Group Feature Selection (GFS) has proven useful for improving the interpretability and predictive performance of learned model parameters in many machine learning and data mining applications. Existing GFS models are mainly based on the squared loss and the logistic loss for regression and classification, leaving the \(\epsilon \)-insensitive loss and the hinge loss popularized by Support Vector Learning (SVL) machines unexplored. In this paper, we present a Bayesian GFS framework for SVL machines based on the pseudo-likelihood and the data augmentation idea. Through Bayesian inference, our method circumvents cross-validation over regularization parameters. Specifically, we apply the mean-field variational method in an augmented space to derive the posterior distributions of model parameters and hyper-parameters for Bayesian estimation. Both regression and classification experiments on synthetic and real-world data sets demonstrate that the proposed approach outperforms a number of competitors.
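To make the group-sparsity idea behind GFS concrete, the sketch below implements block soft-thresholding, the proximal operator of the group-lasso penalty. This is not the Bayesian variational method of the paper; it is a minimal, commonly used illustration of how a group-level penalty zeros out entire feature groups at once, which is the selection behavior the paper's Bayesian priors are designed to induce. The function name and the toy weight vector are illustrative choices, not from the paper.

```python
import numpy as np

def group_soft_threshold(w, groups, tau):
    """Block soft-thresholding, the proximal operator of the group-lasso
    penalty: each group of coefficients is shrunk toward zero, and any
    group whose Euclidean norm falls below tau is zeroed out entirely,
    removing the whole feature group from the model."""
    w = np.asarray(w, dtype=float)
    out = np.zeros_like(w)
    for idx in groups:
        g = w[idx]
        norm = np.linalg.norm(g)
        if norm > tau:
            # Shrink the surviving group radially by tau.
            out[idx] = (1.0 - tau / norm) * g
    return out

# Two groups of two features each: the weak first group is eliminated
# as a whole, while the strong second group is only shrunk.
w = np.array([0.1, -0.2, 3.0, 4.0])
groups = [[0, 1], [2, 3]]
print(group_soft_threshold(w, groups, tau=1.0))  # → [0.  0.  2.4 3.2]
```

The all-or-nothing behavior at the group level is what distinguishes group feature selection from ordinary lasso-style shrinkage, which can keep isolated coordinates from an otherwise irrelevant group.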
Acknowledgments
This work was supported by the National Natural Science Foundation of China (No. 9154610306, 61573335, 61473273, 61473274, 11390371, 11233004), the National Key Basic Research Program of China (Grant No. 2014CB845700), the National High-tech R&D Program of China (863 Program) (No. 2014AA015105), and the Guangdong Provincial Science and Technology Plan Projects (No. 2015B010109005).
Copyright information
© 2016 Springer International Publishing Switzerland
Cite this paper
Du, C., Du, C., Zhe, S., Luo, A., He, Q., Long, G. (2016). Bayesian Group Feature Selection for Support Vector Learning Machines. In: Bailey, J., Khan, L., Washio, T., Dobbie, G., Huang, J., Wang, R. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2016. Lecture Notes in Computer Science, vol 9651. Springer, Cham. https://doi.org/10.1007/978-3-319-31753-3_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-31752-6
Online ISBN: 978-3-319-31753-3
eBook Packages: Computer Science (R0)