ABSTRACT
GLMNET, proposed by Friedman et al., is an algorithm for fitting generalized linear models with elastic-net regularization. It has been widely applied to solve L1-regularized logistic regression. However, recent experiments indicate that the existing GLMNET implementation may not be stable for large-scale problems. In this paper, we propose an improved GLMNET that addresses several theoretical and implementation issues. In particular, as a Newton-type method, GLMNET achieves fast local convergence but may fail to obtain a useful solution quickly. By carefully adjusting the effort spent in each iteration, our method is efficient whether the optimization problem is solved loosely or strictly. Experiments demonstrate that the improved GLMNET is more efficient than a state-of-the-art coordinate descent method.
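The abstract's central idea, spending little effort on each early Newton iteration and progressively more as the iterate approaches the optimum, can be made concrete with a short sketch. The Python below is a minimal, illustrative Newton-type solver for min_w ||w||_1 + C * sum_i log(1 + exp(-y_i w^T x_i)), where each Newton direction is computed by cyclic coordinate descent on an L1-regularized quadratic model of the loss. The function name newglmnet_sketch, the doubling inner-iteration budget, and the simplified backtracking line search are assumptions made for illustration, not the authors' exact implementation (see their technical report, cited below, for the real design).

```python
import numpy as np

def newglmnet_sketch(X, y, C=1.0, outer_iters=20, inner_cap=64):
    """Illustrative Newton-type solver for
        min_w ||w||_1 + C * sum_i log(1 + exp(-y_i * x_i.w)).
    Each outer step builds a quadratic model of the loss and minimizes the
    L1-regularized model approximately by cyclic coordinate descent.
    The doubling inner budget and the crude line search are sketch-only
    assumptions, not the paper's exact design."""
    n, p = X.shape
    w = np.zeros(p)

    def objective(v):
        return np.abs(v).sum() + C * np.logaddexp(0.0, -y * (X @ v)).sum()

    inner_budget = 1                      # loose early solves, tighter later
    for _ in range(outer_iters):
        tau = 1.0 / (1.0 + np.exp(-y * (X @ w)))   # sigma(y_i x_i.w)
        grad = C * (X.T @ ((tau - 1.0) * y))       # gradient of the loss term
        D = tau * (1.0 - tau)                      # Hessian weights, H = C X'DX
        h = C * ((X ** 2).T @ D) + 1e-12           # H_jj, jittered for safety

        d = np.zeros(p)                   # Newton direction being built
        Xd = np.zeros(n)                  # maintain X @ d so (Hd)_j stays cheap
        for _ in range(inner_budget):
            for j in range(p):
                G = grad[j] + C * ((X[:, j] * D) @ Xd)  # model gradient at d
                c = w[j] + d[j]
                # closed-form minimizer of G*z + (h/2)*z^2 + |c + z|
                if G + 1.0 <= h[j] * c:
                    step = -(G + 1.0) / h[j]
                elif G - 1.0 >= h[j] * c:
                    step = -(G - 1.0) / h[j]
                else:
                    step = -c             # land exactly on the kink at zero
                if step != 0.0:
                    d[j] += step
                    Xd += step * X[:, j]
        inner_budget = min(2 * inner_budget, inner_cap)

        # crude backtracking: accept the first step size that decreases f;
        # the actual method uses a sufficient-decrease (Armijo-type) condition
        f_old, t = objective(w), 1.0
        while t > 1e-8 and objective(w + t * d) > f_old:
            t *= 0.5
        w = w + t * d
    return w

# toy usage: 200 samples, 50 features, labels in {-1, +1}
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
y = np.sign(X @ rng.standard_normal(50) + 0.1 * rng.standard_normal(200))
w = newglmnet_sketch(X, y, C=1.0)
print("nonzeros:", int((w != 0).sum()))
```

The growing inner budget mirrors the paper's key point: early outer iterations need only a loose solve of the quadratic subproblem to make progress, while later iterations solve it more tightly to exploit Newton's fast local convergence.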
REFERENCES
- J. Friedman, T. Hastie, and R. Tibshirani, "Regularization paths for generalized linear models via coordinate descent," Journal of Statistical Software, vol. 33, no. 1, pp. 1--22, 2010.
- A. Genkin, D. D. Lewis, and D. Madigan, "Large-scale Bayesian logistic regression for text categorization," Technometrics, vol. 49, no. 3, pp. 291--304, 2007.
- K. Koh, S.-J. Kim, and S. Boyd, "An interior-point method for large-scale l1-regularized logistic regression," Journal of Machine Learning Research, vol. 8, pp. 1519--1555, 2007.
- G. Andrew and J. Gao, "Scalable training of L1-regularized log-linear models," in Proceedings of the Twenty-Fourth International Conference on Machine Learning (ICML), 2007.
- J. Liu, J. Chen, and J. Ye, "Large-scale sparse logistic regression," in Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 547--556, 2009.
- M. Schmidt, G. Fung, and R. Rosales, "Fast optimization methods for l1 regularization: A comparative study and two new approaches," in Proceedings of the European Conference on Machine Learning, pp. 286--297, 2007.
- G.-X. Yuan, K.-W. Chang, C.-J. Hsieh, and C.-J. Lin, "A comparison of optimization methods and software for large-scale l1-regularized linear classification," Journal of Machine Learning Research, vol. 11, pp. 3183--3234, 2010.
- P. Tseng and S. Yun, "A coordinate gradient descent method for nonsmooth separable minimization," Mathematical Programming, vol. 117, pp. 387--423, 2009.
- S. Yun and K.-C. Toh, "A coordinate gradient descent method for l1-regularized convex minimization," Computational Optimization and Applications, vol. 48, no. 2, pp. 273--307, 2011.
- K.-W. Chang, C.-J. Hsieh, and C.-J. Lin, "Coordinate descent method for large-scale L2-loss linear SVM," Journal of Machine Learning Research, vol. 9, pp. 1369--1398, 2008.
- G.-X. Yuan, C.-H. Ho, and C.-J. Lin, "An improved GLMNET for l1-regularized logistic regression and support vector machines," tech. rep., National Taiwan University, 2011.
- H.-F. Yu, H.-Y. Lo, H.-P. Hsieh, J.-K. Lou, T. G. McKenzie, J.-W. Chou, P.-H. Chung, C.-H. Ho, C.-F. Chang, Y.-H. Wei, J.-Y. Weng, E.-S. Yan, C.-W. Chang, T.-T. Kuo, Y.-C. Lo, P. T. Chang, C. Po, C.-Y. Wang, Y.-H. Huang, C.-W. Hung, Y.-X. Ruan, Y.-S. Lin, S.-D. Lin, H.-T. Lin, and C.-J. Lin, "Feature engineering and classifier ensemble for KDD cup 2010," in JMLR Workshop and Conference Proceedings, 2011. To appear.
- R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin, "LIBLINEAR: A library for large linear classification," Journal of Machine Learning Research, vol. 9, pp. 1871--1874, 2008.