
A fast algorithm for non-negativity model selection


Abstract

An efficient optimization algorithm is proposed for identifying the best least squares regression model under the constraint of non-negative coefficients. The algorithm derives an innovative solution via the unrestricted least squares and is based on the regression tree and branch-and-bound techniques for computing the best subset regression. The aim is to fill a gap in computationally tractable solutions to the non-negative least squares and model selection problem. The proposed method is illustrated with a real dataset. Experimental results on real and artificial random datasets confirm the computational efficacy of the new strategy and demonstrate its ability to solve large model selection problems that are subject to non-negativity constraints.
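To make the problem concrete, the following is a minimal sketch of non-negativity-constrained best-subset selection by exhaustive enumeration, feasible only for small n. It illustrates the problem setting, not the paper's branch-and-bound algorithm; it assumes scipy.optimize.nnls for each restricted subproblem and a BIC-style score for comparing model sizes (the paper keeps the selection criterion generic).

```python
# Exhaustive non-negativity-constrained best-subset search (small n only).
# This sketches the problem setting, not the paper's branch-and-bound method.
from itertools import combinations

import numpy as np
from scipy.optimize import nnls


def nonneg_subset_list(X, y):
    """Return the best size-p submodel (columns, coefficients, RSS) for each
    p = 1..n, with the coefficients constrained to be non-negative."""
    n = X.shape[1]
    out = []
    for p in range(1, n + 1):
        best = (None, None, np.inf)
        for cols in combinations(range(n), p):
            beta, rnorm = nnls(X[:, list(cols)], y)  # rnorm = ||y - X beta||_2
            if rnorm**2 < best[2]:
                best = (cols, beta, rnorm**2)
        out.append(best)
    return out


rng = np.random.default_rng(0)
m, n = 50, 6
X = rng.standard_normal((m, n))
y = X @ np.array([2.0, 0.0, 1.5, 0.0, 0.0, 3.0]) + 0.1 * rng.standard_normal(m)
for cols, beta, rss in nonneg_subset_list(X, y):
    # A BIC-style score is one way to compare across sizes (assumed here).
    bic = m * np.log(rss / m) + len(cols) * np.log(m)
    print(cols, np.round(beta, 2), round(rss, 3), round(bic, 1))
```

The exhaustive loop visits all \(2^{n} - 1\) submodels; the paper's contribution is a regression-tree and branch-and-bound strategy that avoids most of this enumeration.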


References

  • Armstrong, R.D., Frome, E.L.: A branch-and-bound solution of a restricted least squares problem. Technometrics 18, 447–450 (1976)


  • Burks, A.W., Warren, D.W., Wright, J.B.: An analysis of a logical machine using parenthesis-free notation. Math. Tables Other Aids Comput. 8(46), 53–57 (1954)


  • Cutler, A.: A branch and bound algorithm for constrained least squares. Commun. Stat., Simul. Comput. 2, 305–321 (1993)


  • Furnival, G.M., Wilson, R.W.: Regression by leaps and bounds. Technometrics 16(4), 499–511 (1974). Republished in Technometrics 42, 69–79 (2000)


  • Gatu, C., Kontoghiorghes, E.J.: Parallel algorithms for computing all possible subset regression models using the QR decomposition. Parallel Comput. 29(4), 505–521 (2003)


  • Gatu, C., Kontoghiorghes, E.J.: Branch-and-bound algorithms for computing the best-subset regression models. J. Comput. Graph. Stat. 15, 139–156 (2006)


  • Gatu, C., Yanev, P.I., Kontoghiorghes, E.J.: A graph approach to generate all possible regression submodels. Comput. Stat. Data Anal. 52, 799–815 (2007)


  • Golub, G.H., Van Loan, C.F.: Matrix Computations, 3rd edn. Johns Hopkins University Press, Baltimore (1996)


  • Hofmann, M., Gatu, C., Kontoghiorghes, E.J.: Efficient algorithms for computing the best subset regression models for large-scale problems. Comput. Stat. Data Anal. 52, 16–29 (2007)


  • Judge, G.G., Takayama, T.: Inequality restrictions in regression analysis. J. Am. Stat. Assoc. 61, 166–181 (1966)


  • Kontoghiorghes, E.J.: Parallel Algorithms for Linear Models: Numerical Methods and Estimation Problems. Advances in Computational Economics, vol. 15. Kluwer Academic, Boston (2000)


  • Kozlov, M.K., Tarasov, S.P., Khachiyan, L.G.: The polynomial solvability of convex quadratic programming. U.S.S.R. Comput. Math. Math. Phys. 20(5), 223–228 (1980)


  • Lawson, C.L., Hanson, R.J.: Solving Least Squares Problems. Prentice-Hall, Englewood Cliffs (1974)


  • Mantel, N.: Restricted least squares regression and convex quadratic programming. Technometrics 11, 763–773 (1969)


  • McDonald, G.C., Schwing, R.C.: Instabilities of regression estimates relating air pollution to mortality. Technometrics 15, 463–482 (1973)


  • Miller, A.J.: Subset Selection in Regression. Monographs on Statistics and Applied Probability, vol. 40. Chapman and Hall, London (1990)


  • Potra, F.A., Wright, S.J.: Interior-point methods. J. Comput. Appl. Math. 124, 281–302 (2000)


  • Rueda, C., Salvador, B., Fernández, M.A.: Simultaneous estimation in a restricted linear model. J. Multivar. Anal. 61, 61–66 (1997)


  • Sen, A., Srivastava, M.: Regression Analysis. Theory, Methods and Applications. Springer Texts in Statistics. Springer, New York (1990)


  • Smith, D.M., Bremner, J.M.: All possible subset regressions using the QR decomposition. Comput. Stat. Data Anal. 7, 217–235 (1989)


  • Waterman, M.S.: A restricted least squares problem. Technometrics 16, 135–136 (1974)


  • Waterman, M.S.: Least squares with nonnegative regression coefficients. J. Stat. Comput. Simul. 6, 67–70 (1977)


  • Wolfe, P.: The simplex method for quadratic programming. Econometrica 27, 382–398 (1959)



Acknowledgements

The authors are grateful to the Editor and referees for their valuable comments and suggestions. This work was supported in part by the University of Cyprus project UCY-31030, the Cyprus University of Technology project E.X. 200079, and the Romanian project PN-II-RU-TE-2011-3-0242.

Author information

Corresponding author

Correspondence to Cristian Gatu.

Appendices

Appendix A: Selection criteria equivalence

Lemma 2

Let V={1,2,…,n} denote a set of variables. Finding the best-subset regression with respect to some selection criterion f∈{RSS,AIC,BIC,…} reduces to finding the n-list of the best subset models corresponding to each model size p=1,…,n.

Proof

Without loss of generality, it is assumed that the selection criterion f is to be minimized. The problem of finding the best submodel with respect to f can be written as

\[ W^{*} = \operatorname*{argmin}_{\emptyset \neq W \subseteq V} f(W). \tag{6} \]

Note that

\[ \min_{\emptyset \neq W \subseteq V} f(W) = \min_{p=1,\dots,n} \ \min_{\substack{W \subseteq V \\ |W| = p}} f(W). \]

Now, consider the n-list of the best submodels of each size p, given by

\[ \widehat{W}_{p} = \operatorname*{argmin}_{\substack{W \subseteq V \\ |W| = p}} f(W), \qquad p = 1, \dots, n. \]

Problem (6) then becomes

\[ W^{*} = \operatorname*{argmin}_{p = 1, \dots, n} f(\widehat{W}_{p}). \]

This completes the proof. □

There are \(\binom{n}{p} \) candidate submodels that select p variables out of n. The total number of submodels of V, excluding the empty model, is \(\sum_{p=1}^{n} \binom{n}{p} = 2^{n} - 1\). Lemma 2 states that once the n-list of the best submodels corresponding to each model size has been computed, the best submodel of V is obtained as the optimum of this n-best list.

Property 1

Let V={1,2,…,n} denote a set of variables. A selection criterion f is said to be monotone in the RSS for equal-size subsets if, for all \(W_{1}, W_{2} \subseteq V\) with \(|W_{1}| = |W_{2}|\),

\[ \mathrm{RSS}(W_{1}) \leq \mathrm{RSS}(W_{2}) \implies f(W_{1}) \leq f(W_{2}). \]

Standard selection criteria such as the AIC and BIC satisfy this property, since for equal-size subsets the penalty term is constant.
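For instance, with the Gaussian-likelihood form of the AIC for m observations (one standard definition, assumed here since the property only requires a size-dependent penalty),

\[ \mathrm{AIC}(W) = m \log\bigl(\mathrm{RSS}(W)/m\bigr) + 2\,|W|, \]

the penalty \(2p\) is identical for all subsets with \(|W_{1}| = |W_{2}| = p\) and the logarithm is increasing, so

\[ \mathrm{RSS}(W_{1}) \leq \mathrm{RSS}(W_{2}) \ \implies\ \mathrm{AIC}(W_{1}) \leq \mathrm{AIC}(W_{2}). \]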

Lemma 3

Let V={1,2,…,n} denote a set of variables and let p be such that 1 ≤ p ≤ n. If f is a selection criterion that satisfies the monotonicity Property 1, then

\[ f(W_{p}^{*}) = f(\widehat{W}_{p}), \]

where \(W_{p}^{*}\) and \(\widehat{W}_{p}\) denote the best size-p submodels with respect to the RSS and to f, respectively.

Proof

Let

\[ W_{p}^{*} = \operatorname*{argmin}_{\substack{W \subseteq V \\ |W| = p}} \mathrm{RSS}(W) \quad \text{and} \quad \widehat{W}_{p} = \operatorname*{argmin}_{\substack{W \subseteq V \\ |W| = p}} f(W). \]

It follows that

\[ \mathrm{RSS}(W_{p}^{*}) \leq \mathrm{RSS}(\widehat{W}_{p}). \]

Using the monotonicity Property 1, the latter becomes

\[ f(W_{p}^{*}) \leq f(\widehat{W}_{p}), \]

while the definition of \(\widehat{W}_{p}\) gives the reverse inequality \(f(\widehat{W}_{p}) \leq f(W_{p}^{*})\), which implies \(f(W_{p}^{*}) = f(\widehat{W}_{p})\). This completes the proof. □

Lemma 3 states that, for a fixed model size p, the same best submodel minimizes the RSS as well as any selection criterion that satisfies the RSS-monotonicity Property 1.
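The following self-contained sketch checks both reductions by exhaustive enumeration, again feasible only for small n; it assumes ordinary least squares and the Gaussian AIC form shown above, while the paper keeps the criterion generic.

```python
# Sketch verifying Lemmas 2 and 3 by exhaustive enumeration (small n only).
# Assumes ordinary least squares and the Gaussian-likelihood AIC form.
from itertools import combinations

import numpy as np


def rss(X, y, cols):
    Xc = X[:, list(cols)]
    beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
    r = y - Xc @ beta
    return float(r @ r)


def aic(X, y, cols):
    m = len(y)
    return m * np.log(rss(X, y, cols) / m) + 2 * len(cols)


rng = np.random.default_rng(1)
m, n = 40, 5
X, y = rng.standard_normal((m, n)), rng.standard_normal(m)
subsets = [c for p in range(1, n + 1) for c in combinations(range(n), p)]

# Lemma 3: for each fixed size p, the RSS-best submodel is also AIC-best.
n_list = []
for p in range(1, n + 1):
    size_p = [c for c in subsets if len(c) == p]
    best_rss = min(size_p, key=lambda c: rss(X, y, c))
    assert best_rss == min(size_p, key=lambda c: aic(X, y, c))
    n_list.append(best_rss)

# Lemma 2: the AIC-best submodel overall is the AIC-optimum of the n-list.
direct = min(subsets, key=lambda c: aic(X, y, c))
assert direct == min(n_list, key=lambda c: aic(X, y, c))
```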

Appendix B: The Survival data

Details and a description of the data can be found in Armstrong and Frome (1976) and Cutler (1993). With the original ordering of the variables, the best-subset solution is obtained in the root node of the regression tree. Thus, for illustration purposes, in the case study the first and last independent variables (\(a_{2}\) and \(a_{4}\)) have been interchanged to give the dataset below. The constant variable \(a_{1} = 1\) has been added. The variable labels as they appear in the original data are indicated in the last column.

Cite this article

Gatu, C., Kontoghiorghes, E.J.: A fast algorithm for non-negativity model selection. Stat. Comput. 23, 403–411 (2013). https://doi.org/10.1007/s11222-012-9318-8
