ABSTRACT
In this paper, an L1/2+1 regularized logistic regression model and a corresponding algorithm are proposed. The L1/2 penalty enjoys unbiasedness, sparsity, and oracle properties, while the L1 penalty contributes convexity in theory. The regularization term of the model is a linear combination of the L1/2 norm and the L1 norm, which effectively alleviates overfitting and improves the generalization ability of the model. The algorithm adopts the idea of coordinate descent, transforming the parameter estimation into a series of univariate extremum problems and thereby yielding an analytical expression for each parameter update. Experiments on simulated and real data show that, in some cases, the proposed model and algorithm outperform traditional logistic regression and several classical regularized logistic regression methods in variable selection and prediction ability, and are well suited to small-sample data sets with low correlation between variables.
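Since the abstract describes the method but no code is given, the following is a minimal Python sketch of the idea, not the authors' implementation. It fits logistic regression with the hybrid penalty lam1·||w||_1 + lam2·Σ|w_j|^{1/2} by cyclic coordinate descent. The paper's analytical coordinate update is replaced here, for illustration, by a numerical one-dimensional search with scipy.optimize.minimize_scalar; the function names, penalty weights, and search bounds are illustrative assumptions.

```python
# Illustrative sketch (not the authors' implementation): logistic regression
# with a hybrid penalty  lam1 * ||w||_1 + lam2 * sum_j |w_j|^{1/2},
# fit by cyclic coordinate descent. The paper derives analytical coordinate
# updates; here each univariate subproblem is minimized numerically instead.
import numpy as np
from scipy.optimize import minimize_scalar


def objective(w, X, y, lam1, lam2):
    """Negative log-likelihood plus the hybrid L1/2 + L1 penalty (y in {0,1})."""
    z = X @ w
    # log(1 + exp(z)) - y*z, computed stably via logaddexp
    nll = np.sum(np.logaddexp(0.0, z) - y * z)
    return nll + lam1 * np.sum(np.abs(w)) + lam2 * np.sum(np.abs(w) ** 0.5)


def fit_hybrid_logistic(X, y, lam1=0.1, lam2=0.1, n_sweeps=50, tol=1e-6):
    """Cyclic coordinate descent: optimize one coefficient at a time."""
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(n_sweeps):
        w_old = w.copy()
        for j in range(p):
            def sub(wj, j=j):
                w_try = w.copy()
                w_try[j] = wj
                return objective(w_try, X, y, lam1, lam2)
            # bounded 1-D search around the current value (assumed window)
            res = minimize_scalar(sub, bounds=(w[j] - 5.0, w[j] + 5.0),
                                  method="bounded")
            w[j] = res.x
        if np.max(np.abs(w - w_old)) < tol:  # stop when a full sweep changes little
            break
    return w


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 10))
    true_w = np.zeros(10)
    true_w[:3] = [2.0, -1.5, 1.0]          # only the first 3 variables matter
    y = (rng.random(200) < 1.0 / (1.0 + np.exp(-(X @ true_w)))).astype(float)
    w_hat = fit_hybrid_logistic(X, y)
    print(np.round(w_hat, 3))              # irrelevant coefficients shrink toward 0
```

Note that the bounded scalar search does not exploit the nonsmoothness of the penalty at zero, so estimated coefficients here shrink toward zero rather than landing exactly on it; an analytical thresholding-style update, as the abstract describes, would produce exact zeros and hence true variable selection.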
Index Terms
- Sparse Logistic Regression with the Hybrid L1/2+1 Regularization