ABSTRACT
GLMNET, proposed by Friedman et al., is an algorithm for fitting generalized linear models with elastic-net regularization. It has been widely applied to solve L1-regularized logistic regression. However, recent experiments indicate that the existing GLMNET implementation may not be stable for large-scale problems. In this paper, we propose an improved GLMNET that addresses several theoretical and implementation issues. In particular, as a Newton-type method, GLMNET achieves fast local convergence but may fail to obtain a useful solution quickly. By carefully adjusting the effort spent in each iteration, our method is efficient whether the optimization problem is solved loosely or strictly. Experiments demonstrate that the improved GLMNET is more efficient than a state-of-the-art coordinate descent method.
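The abstract's central idea, spending little effort on each early Newton iteration and progressively more as the iterate approaches the optimum, can be made concrete with a short sketch. The Python below is a minimal, illustrative Newton-type solver for min_w ||w||_1 + C * sum_i log(1 + exp(-y_i w^T x_i)), where each Newton direction is computed by cyclic coordinate descent on an L1-regularized quadratic model of the loss. The function name newglmnet_sketch, the doubling inner-iteration budget, and the simplified backtracking line search are assumptions made for illustration, not the authors' exact implementation (see their technical report, cited below, for the real design).

```python
import numpy as np

def newglmnet_sketch(X, y, C=1.0, outer_iters=20, inner_cap=64):
    """Illustrative Newton-type solver for
        min_w ||w||_1 + C * sum_i log(1 + exp(-y_i * x_i.w)).
    Each outer step builds a quadratic model of the loss and minimizes the
    L1-regularized model approximately by cyclic coordinate descent.
    The doubling inner budget and the crude line search are sketch-only
    assumptions, not the paper's exact design."""
    n, p = X.shape
    w = np.zeros(p)

    def objective(v):
        return np.abs(v).sum() + C * np.logaddexp(0.0, -y * (X @ v)).sum()

    inner_budget = 1                      # loose early solves, tighter later
    for _ in range(outer_iters):
        tau = 1.0 / (1.0 + np.exp(-y * (X @ w)))   # sigma(y_i x_i.w)
        grad = C * (X.T @ ((tau - 1.0) * y))       # gradient of the loss term
        D = tau * (1.0 - tau)                      # Hessian weights, H = C X'DX
        h = C * ((X ** 2).T @ D) + 1e-12           # H_jj, jittered for safety

        d = np.zeros(p)                   # Newton direction being built
        Xd = np.zeros(n)                  # maintain X @ d so (Hd)_j stays cheap
        for _ in range(inner_budget):
            for j in range(p):
                G = grad[j] + C * ((X[:, j] * D) @ Xd)  # model gradient at d
                c = w[j] + d[j]
                # closed-form minimizer of G*z + (h/2)*z^2 + |c + z|
                if G + 1.0 <= h[j] * c:
                    step = -(G + 1.0) / h[j]
                elif G - 1.0 >= h[j] * c:
                    step = -(G - 1.0) / h[j]
                else:
                    step = -c             # land exactly on the kink at zero
                if step != 0.0:
                    d[j] += step
                    Xd += step * X[:, j]
        inner_budget = min(2 * inner_budget, inner_cap)

        # crude backtracking: accept the first step size that decreases f;
        # the actual method uses a sufficient-decrease (Armijo-type) condition
        f_old, t = objective(w), 1.0
        while t > 1e-8 and objective(w + t * d) > f_old:
            t *= 0.5
        w = w + t * d
    return w

# toy usage: 200 samples, 50 features, labels in {-1, +1}
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
y = np.sign(X @ rng.standard_normal(50) + 0.1 * rng.standard_normal(200))
w = newglmnet_sketch(X, y, C=1.0)
print("nonzeros:", int((w != 0).sum()))
```

The growing inner budget mirrors the paper's key point: early outer iterations need only a loose solve of the quadratic subproblem to make progress, while later iterations solve it more tightly to exploit Newton's fast local convergence.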
REFERENCES
- J. Friedman, T. Hastie, and R. Tibshirani, "Regularization paths for generalized linear models via coordinate descent," Journal of Statistical Software, vol. 33, no. 1, pp. 1--22, 2010.
- A. Genkin, D. D. Lewis, and D. Madigan, "Large-scale Bayesian logistic regression for text categorization," Technometrics, vol. 49, no. 3, pp. 291--304, 2007.
- K. Koh, S.-J. Kim, and S. Boyd, "An interior-point method for large-scale l1-regularized logistic regression," Journal of Machine Learning Research, vol. 8, pp. 1519--1555, 2007.
- G. Andrew and J. Gao, "Scalable training of L1-regularized log-linear models," in Proceedings of the Twenty-Fourth International Conference on Machine Learning (ICML), 2007.
- J. Liu, J. Chen, and J. Ye, "Large-scale sparse logistic regression," in Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 547--556, 2009.
- M. Schmidt, G. Fung, and R. Rosales, "Fast optimization methods for l1 regularization: A comparative study and two new approaches," in Proceedings of the European Conference on Machine Learning, pp. 286--297, 2007.
- G.-X. Yuan, K.-W. Chang, C.-J. Hsieh, and C.-J. Lin, "A comparison of optimization methods and software for large-scale l1-regularized linear classification," Journal of Machine Learning Research, vol. 11, pp. 3183--3234, 2010.
- P. Tseng and S. Yun, "A coordinate gradient descent method for nonsmooth separable minimization," Mathematical Programming, vol. 117, pp. 387--423, 2009.
- S. Yun and K.-C. Toh, "A coordinate gradient descent method for l1-regularized convex minimization," Computational Optimization and Applications, vol. 48, no. 2, pp. 273--307, 2011.
- K.-W. Chang, C.-J. Hsieh, and C.-J. Lin, "Coordinate descent method for large-scale L2-loss linear SVM," Journal of Machine Learning Research, vol. 9, pp. 1369--1398, 2008.
- G.-X. Yuan, C.-H. Ho, and C.-J. Lin, "An improved GLMNET for l1-regularized logistic regression and support vector machines," tech. rep., National Taiwan University, 2011.
- H.-F. Yu, H.-Y. Lo, H.-P. Hsieh, J.-K. Lou, T. G. McKenzie, J.-W. Chou, P.-H. Chung, C.-H. Ho, C.-F. Chang, Y.-H. Wei, J.-Y. Weng, E.-S. Yan, C.-W. Chang, T.-T. Kuo, Y.-C. Lo, P. T. Chang, C. Po, C.-Y. Wang, Y.-H. Huang, C.-W. Hung, Y.-X. Ruan, Y.-S. Lin, S.-D. Lin, H.-T. Lin, and C.-J. Lin, "Feature engineering and classifier ensemble for KDD cup 2010," in JMLR Workshop and Conference Proceedings, 2011. To appear.
- R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin, "LIBLINEAR: A library for large linear classification," Journal of Machine Learning Research, vol. 9, pp. 1871--1874, 2008.