DOI: 10.1145/1374376.1374384

Algorithms for subset selection in linear regression

Published: 17 May 2008

Abstract

We study the problem of selecting a subset of k random variables to observe that will yield the best linear prediction of another variable of interest, given the pairwise correlations between the observation variables and the predictor variable. Under approximation-preserving reductions, this problem is equivalent to the "sparse approximation" problem of approximating signals concisely. The subset selection problem is NP-hard in general; in this paper, we propose and analyze exact and approximation algorithms for several special cases of practical interest. Specifically, we give an FPTAS when the covariance matrix has constant bandwidth, and exact algorithms when the associated covariance graph, consisting of edges for pairs of variables with non-zero correlation, forms a tree or has a large (known) independent set. Furthermore, we give an exact algorithm when the variables can be embedded into a line such that the covariance decreases exponentially in the distance, and a constant-factor approximation when the variables have no "conditional suppressor variables". Much of our reasoning is based on perturbation results for the R² multiple correlation measure, which is frequently used as a natural measure of goodness of fit. These results lie at the core of our FPTAS, and also allow us to extend our exact algorithms to approximation algorithms when the covariance matrix "nearly" falls into one of the above classes. We also use our perturbation analysis to prove approximation guarantees for the widely used "Forward Regression" heuristic under the assumption that the observation variables are nearly independent.
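As a minimal illustration of the quantities the abstract refers to (a sketch, not the paper's own implementation), the snippet below computes the R² of a subset from pairwise covariances and runs the greedy "Forward Regression" heuristic. It assumes standardized observation variables and a unit-variance target, so that R²(S) = b_Sᵀ C_S⁻¹ b_S, where C is the covariance matrix among the observation variables and b is the vector of their covariances with the target; the function names are ours.

```python
import numpy as np

def r_squared(C, b, S):
    """R^2 of the best linear predictor of the target from the variables in S.

    C : covariance matrix among the (standardized) observation variables,
    b : covariances between each observation variable and the target,
    S : list of selected indices.
    Assuming the target has unit variance, R^2(S) = b_S^T C_S^{-1} b_S.
    """
    if not S:
        return 0.0
    bS = b[S]
    # Solve C_S x = b_S instead of forming an explicit inverse.
    return float(bS @ np.linalg.solve(C[np.ix_(S, S)], bS))

def forward_regression(C, b, k):
    """Greedy Forward Regression heuristic: repeatedly add the variable
    giving the largest increase in R^2, until k variables are selected."""
    n = len(b)
    S = []
    for _ in range(k):
        base = r_squared(C, b, S)
        best, best_gain = None, -np.inf
        for i in range(n):
            if i in S:
                continue
            gain = r_squared(C, b, S + [i]) - base
            if gain > best_gain:
                best, best_gain = i, gain
        S.append(best)
    return S

# Example: three mutually uncorrelated observation variables.
C = np.eye(3)
b = np.array([0.6, 0.5, 0.1])
print(forward_regression(C, b, 2))   # greedily picks the two strongest: [0, 1]
```

Note that R² is not submodular in general (suppressor variables can make a variable's marginal value grow as others are added), which is why the paper's guarantees for Forward Regression require the observation variables to be nearly independent.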




    Published In

STOC '08: Proceedings of the 40th Annual ACM Symposium on Theory of Computing
    May 2008
    712 pages
    ISBN:9781605580470
    DOI:10.1145/1374376


Publisher

Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. sparse approximation
    2. subset selection

    Qualifiers

    • Research-article

    Conference

STOC '08: Symposium on Theory of Computing
May 17 - 20, 2008
Victoria, British Columbia, Canada

    Acceptance Rates

    STOC '08 Paper Acceptance Rate 80 of 325 submissions, 25%;
    Overall Acceptance Rate 1,469 of 4,586 submissions, 32%



    Article Metrics

• Downloads (last 12 months): 130
• Downloads (last 6 weeks): 4
Reflects downloads up to 16 Feb 2025


    Cited By

• Online mixed discrete and continuous optimization: Algorithms, regret analysis and applications. Automatica, 175:112189, May 2025. DOI: 10.1016/j.automatica.2025.112189
• Deterministic streaming algorithms for non-monotone submodular maximization. Frontiers of Computer Science, 19(6), June 2025. DOI: 10.1007/s11704-024-40266-4
• Deletion-anticipative data selection with a limited budget. In Proceedings of the 41st International Conference on Machine Learning, 45468-45507, July 2024. DOI: 10.5555/3692070.3693920
• Enhancing Time Series Classification with Explainable Time-Frequency Features Representation. In Pattern Recognition and Computer Vision, 522-536, November 2024. DOI: 10.1007/978-981-97-8487-5_36
• Sequential metamodel-based approaches to level-set estimation under heteroscedasticity. Statistical Analysis and Data Mining, 17(3), June 2024. DOI: 10.1002/sam.11697
• Dynamic non-monotone submodular maximization. In Proceedings of the 37th International Conference on Neural Information Processing Systems, 17369-17382, December 2023. DOI: 10.5555/3666122.3666883
• Streaming Algorithms for Non-Submodular Maximization on the Integer Lattice. Tsinghua Science and Technology, 28(5):1-8, October 2023. DOI: 10.26599/TST.2022.9010031
• Approximation Algorithms for Robot Tours in Random Fields with Guaranteed Estimation Accuracy. In 2023 IEEE International Conference on Robotics and Automation (ICRA), 7830-7836, May 2023. DOI: 10.1109/ICRA48891.2023.10160912
• Improved Deterministic Algorithms for Non-monotone Submodular Maximization. Theoretical Computer Science, 114293, November 2023. DOI: 10.1016/j.tcs.2023.114293
• Algorithms for Cardinality-Constrained Monotone DR-Submodular Maximization with Low Adaptivity and Query Complexity. Journal of Optimization Theory and Applications, 200(1):194-214, December 2023. DOI: 10.1007/s10957-023-02353-7
