ABSTRACT
Data collected in practical applications across many fields are increasingly ultra-high-dimensional and large-scale, and many traditional analysis methods handle such high-dimensional data inefficiently. It is therefore essential to develop methods designed for high-dimensional data. In this paper, the Elastic-net model is adopted as the base regularization model for high-dimensional sparse data, and penalty factors are added to strengthen its ability to retain key features. To reduce the computational burden of high-dimensional data, we propose applying the "two-step" SSR+PCD procedure, which couples a screening rule with a fitting method, to the model with penalty factors. For tuning parameter selection, traditional cross-validation is replaced by an information criterion, and the use of information criteria is extended to regularization models with screening rules, broadening their range of application. Simulation studies confirm that adding penalty factors is reasonable and that the selected information criterion can choose tuning parameters under this model, and a real example illustrates the application to high-dimensional gene expression data.
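For concreteness, a minimal sketch of the weighted Elastic-net objective described above, assuming the penalty factors enter as per-coefficient weights (the notation $w_j$, $\alpha$, and $\lambda$ is ours, not the paper's):

```latex
% Elastic-net with penalty factors (a sketch; w_j, alpha, lambda are assumed notation)
\hat{\beta} = \arg\min_{\beta \in \mathbb{R}^p}
  \frac{1}{2n}\,\lVert y - X\beta \rVert_2^2
  + \lambda \sum_{j=1}^{p} w_j \left( \alpha\,\lvert \beta_j \rvert
  + \frac{1-\alpha}{2}\,\beta_j^2 \right)
```

Here $w_j \ge 0$ is the penalty factor of the $j$-th coefficient: $w_j < 1$ penalizes a presumed key feature less, making it more likely to survive shrinkage, while $w_j = 1$ for all $j$ recovers the standard Elastic-net.

Likewise, a hedged Python sketch of replacing cross-validation with an information criterion when choosing the tuning parameter. The criterion below is an extended-BIC-style penalty; the grid, the value of gamma, and the use of scikit-learn's `ElasticNet` (which has no penalty-factor argument) are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np
from sklearn.linear_model import ElasticNet

def select_lambda_ebic(X, y, lambdas, l1_ratio=0.5, gamma=0.5):
    """Pick the penalty level minimizing an extended-BIC-style criterion
    along the regularization path (sketch, not the paper's procedure)."""
    n, p = X.shape
    best = (np.inf, None, None)
    for lam in lambdas:
        model = ElasticNet(alpha=lam, l1_ratio=l1_ratio).fit(X, y)
        rss = np.sum((y - model.predict(X)) ** 2)
        df = np.count_nonzero(model.coef_)  # active-set size as a df proxy
        # EBIC: n*log(RSS/n) + df*log(n) + 2*gamma*df*log(p)
        ebic = n * np.log(rss / n) + df * np.log(n) + 2 * gamma * df * np.log(p)
        if ebic < best[0]:
            best = (ebic, lam, model)
    return best

# Usage with synthetic high-dimensional sparse data (n=100, p=500)
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 500))
beta = np.zeros(500)
beta[:5] = 2.0
y = X @ beta + rng.standard_normal(100)
ebic, lam, model = select_lambda_ebic(X, y, np.logspace(-2, 0, 30))
print(f"selected lambda={lam:.4f}, EBIC={ebic:.2f}")
```

Unlike cross-validation, this requires only one fit per grid point rather than one per fold, which is the computational motivation for using an information criterion in the high-dimensional setting.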
Index Terms
- Selection of regularization model for linear regression under high-dimensional data