Monotone splines lasso
Introduction
Along with the massive production of large data sets within most areas of science and technology, methods for high-dimensional regression have become increasingly important. When the number of predictors p is large compared to the sample size n, penalized regression methods handle the dimensionality problem by adding a penalty to the negative log-likelihood to be minimized. The lasso (Tibshirani, 1996) and its many variants (Zou, 2006, van de Geer et al., 2011, Yuan and Lin, 2006, Zou and Hastie, 2005, Meinshausen, 2007) have the advantage of setting some of the regression coefficients to zero, thus producing a sparse solution. Recently, nonparametric methods for high-dimensional regression have started to emerge. Recent papers (Avalos et al., 2007, Meier et al., 2009, Huang et al., 2010, Ravikumar et al., 2009) consider a generalized additive model (GAM) (Hastie and Tibshirani, 1990) in combination with spline approximations. Given the observations (y_i, x_i), i = 1, ..., n, where y_i is the response and x_i = (x_i1, ..., x_ip) is the vector of covariates for observation i, the additive model is given as

y_i = β_0 + Σ_{j=1}^p f_j(x_ij) + ε_i.

Here β_0 is the intercept, the f_j's are unknown functions to be estimated and ε_i is the independent random error with mean zero and variance σ^2. We assume Σ_{i=1}^n f_j(x_ij) = 0 for j = 1, ..., p, to ensure unique identification of the f_j's. In Avalos et al. (2007), Meier et al. (2009), Bühlmann and van de Geer (2011), Huang et al. (2010) and Ravikumar et al. (2009), each nonparametric component f_j is represented by a linear combination of spline basis functions, and the problem can be viewed as a group lasso problem (Yuan and Lin, 2006) by selecting groups of spline basis functions representing relevant covariates. Covariates are often represented by B-splines due to their flexibility and minimal assumptions with respect to the form of the function to be estimated. Combined with the group lasso, the framework becomes a highly flexible alternative to (standard) linear lasso-type methods.
Our aim is to construct a new method for high-dimensional regression which is nonparametric and flexible as above, but which can be restricted to select and estimate monotone functions only. In certain biomedical applications it is important to assume that the relationship between an explanatory variable and the outcome is monotonically increasing or decreasing. Indeed, every time linear regression is applied, an implicit assumption of monotonicity is made. For example, monotone, but not necessarily linear, relations typically appear for dose–response data. It is also reasonable to assume that the relationship between a disease and a risk factor is monotone, but not necessarily linear (Raftery and Richardson, 1996).
There has been a major effort in developing methods for monotone regression beyond the strictly linear regression models. In simple regression problems, monotone increasing relationships are often modeled through isotonic regression (Barlow et al., 1972, Robertson et al., 1988). Additive isotonic models, assuming that each component effect in the additive model is isotonic, were presented in Bacchetti (1989). However, most literature on monotone and isotonic regressions is limited to low dimensions. Very recently, one important contribution has appeared for monotone regression in high dimensions. Fang and Meinshausen (2012) propose Lasso Isotone (liso), combining estimation of nonparametric isotonic functions with ideas from sparse high-dimensional regression in an additive isotonic regression model. This is, to our knowledge, the only method feasible for monotone high-dimensional problems. Using an adaptive liso approach, Fang and Meinshausen (2012) also present a way of fitting the model without assuming that all effects are either increasing or decreasing, thus allowing for component effects of different signs. In this paper we develop another, substantially different, tool for the same purpose.
Isotonic regression is probably the best known method for preserving monotonicity, but has the disadvantage of producing step functions, which often have little biological plausibility, instead of smooth functions. For simple regression, it is possible to use an additional smoothing procedure in a second step to obtain a smooth function (He and Shi, 1998). Tibshirani et al. (2011) proposed nearly-isotonic regression which involves a penalty term controlling the level of monotonicity compared to the goodness of fit.
Another way of preserving monotonicity is to fit a smooth monotone function via monotone regression splines (Ramsay, 1988, He and Shi, 1998). While He and Shi (1998) proposed monotone B-spline smoothing based on a constrained least absolute deviation principle, Ramsay (1988) introduced integrated splines (I-splines), which essentially are integrated versions of M-splines and which, in combination with strictly positive coefficients, produce monotone increasing smooth functions. I-splines have previously been used in connection with a boosting technique to do monotonic regression in a multivariate model in Tutz and Leitenstorfer (2007). Meyer (2008) also considers shape-restricted regression splines by means of I-splines, but only in the one-dimensional case.
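Ramsay's construction can be illustrated in miniature. Order-1 I-splines are simply piecewise-linear "ramp" functions (integrals of piecewise-constant M-splines), and any nonnegative combination of them is nondecreasing. The following sketch is our own illustration, not code from the paper; the function name and knot placement are ours.

```python
import numpy as np

def ispline1_basis(x, knots):
    """Order-1 I-splines: piecewise-linear 'ramp' functions that rise
    from 0 to 1 between consecutive knots, then stay at 1."""
    x = np.asarray(x, dtype=float)
    B = np.empty((x.size, len(knots) - 1))
    for k in range(len(knots) - 1):
        lo, hi = knots[k], knots[k + 1]
        B[:, k] = np.clip((x - lo) / (hi - lo), 0.0, 1.0)
    return B

x = np.linspace(0.0, 1.0, 200)
B = ispline1_basis(x, knots=[0.0, 0.25, 0.5, 0.75, 1.0])
beta = np.array([0.5, 0.0, 2.0, 1.0])   # nonnegative coefficients
f = B @ beta
# Each basis column is nondecreasing, so any nonnegative
# combination is a nondecreasing function of x.
assert np.all(np.diff(f) >= 0)
```

Higher-order I-splines replace the ramps with smooth increasing basis functions, but the monotonicity argument is identical: nonnegative weights on nondecreasing basis functions give a nondecreasing fit.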
In this paper a new approach to fit nonparametric additive models under the assumption that each component effect is monotone is proposed. The monotone splines lasso (MS-lasso) combines the idea of I-splines with the cooperative lasso (Chiquet et al., 2012), and is feasible in high-dimensional settings where the number of covariates p can exceed the number of observations n. The cooperative lasso is a lasso method where known groups of covariates are treated together, but it differs from the standard group lasso (Yuan and Lin, 2006) in that it assumes that the groups are sign-coherent. That is, the covariates inside a group are cooperating, so the linear coefficients within a group are either all nonpositive, all nonnegative or all zero. This can be combined with monotone I-splines by letting each covariate, represented via an I-spline basis, constitute a group in the cooperative lasso. Thus the MS-lasso fits the additive nonparametric regression model with components that can be nondecreasing, nonincreasing or of no effect. The important advantages of the MS-lasso are that monotone increasing and monotone decreasing functions can appear in the same model, and that it fits smooth monotone functions to each component. In this way it is more flexible than the linear model, but more restrictive than purely nonparametric methods without any shape constraints. The method is also biologically more relevant than the adaptive liso, in that smooth representations of the functions are immediately obtainable. A two-step estimator is also proposed, the adaptive MS-lasso, which leads to less bias and fewer false positives in the final model.
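As we read Chiquet et al. (2012), the sign-coherence mechanism comes from penalizing the group-lasso norms of the positive and negative parts of each group separately: a sign-coherent group pays the ordinary group-lasso price, while a mixed-sign group pays strictly more. A tiny numerical check of this property (our own illustration, not the authors' code):

```python
import numpy as np

def coop_penalty(beta_g):
    """Cooperative-lasso penalty for one group: the Euclidean norms of the
    positive part and the negative part are penalized separately."""
    pos = np.maximum(beta_g, 0.0)   # positive part
    neg = np.maximum(-beta_g, 0.0)  # negative part
    return np.linalg.norm(pos) + np.linalg.norm(neg)

coherent = np.array([1.0, 2.0, 0.0])  # all nonnegative: one part vanishes,
mixed = np.array([1.0, -2.0, 0.0])    # signs disagree: both parts contribute

# For a sign-coherent group the penalty reduces to the group-lasso norm
# ||beta_g||_2; for a mixed-sign group it is strictly larger.
assert np.isclose(coop_penalty(coherent), np.linalg.norm(coherent))
assert coop_penalty(mixed) > np.linalg.norm(mixed)
```

This is why, with an I-spline basis as the group, the penalty pushes each component toward all-nonnegative (increasing), all-nonpositive (decreasing) or all-zero (excluded) spline coefficients.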
This paper is organized as follows. In Section 2 we present the MS-lasso and discuss some of its properties. The adaptive MS-lasso is also presented, and connections to related methods are discussed. Section 3 is dedicated to simulation studies. In Section 4 the use of the method is illustrated in genomic data, before a final discussion is presented in Section 5.
Section snippets
Monotone splines lasso
Suppose that each of the regression functions f_j in the additive model in (1) can be approximated by s_j, a linear combination of spline basis functions, that is,

s_j(x) = Σ_{k=1}^{K} β_jk φ_jk(x).

Here φ_jk is the kth basis function and β_jk is the kth spline coefficient for the jth covariate. Note that for the standard linear regression model K = 1 and φ_j1(x) = x.
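To make the basis-expansion idea concrete, here is a hedged sketch of fitting a single monotone component: the covariate is expanded in a ramp (order-1 I-spline-type) basis with knots at quantiles, and the spline coefficients are constrained to be nonnegative via nonnegative least squares. This illustrates only the monotone basis representation; the MS-lasso's sparsity penalty and group selection are omitted, and all names here are our own.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
n = 300
x = rng.uniform(0.0, 1.0, n)
y = np.log1p(5 * x) + rng.normal(0.0, 0.1, n)  # smooth increasing truth + noise

# Ramp basis with knots at quantiles of x (six intervals).
knots = np.quantile(x, np.linspace(0.0, 1.0, 7))
B = np.column_stack(
    [np.clip((x - knots[k]) / (knots[k + 1] - knots[k]), 0.0, 1.0)
     for k in range(len(knots) - 1)]
)
A = np.column_stack([np.ones(n), B])  # intercept + monotone basis

# Nonnegative spline coefficients give a monotone nondecreasing fit.
# (nnls also constrains the intercept to be nonnegative; that is
# harmless for this illustration since y is positive.)
coef, _ = nnls(A, y)
order = np.argsort(x)
fit = (A @ coef)[order]
assert np.all(np.diff(fit) >= -1e-10)  # fitted component is nondecreasing
```

Replacing the ramps with a higher-order I-spline basis yields a smooth monotone fit, and stacking one such basis per covariate gives the design matrix on which the cooperative-lasso penalty operates.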
The standard lasso and the adaptive lasso, using an L1-norm penalty together with the original data measurements, will be restricted to important
Simulation studies
To demonstrate the finite-sample performance of the proposed methods, the results from several experiments are reported here. In all experiments we use the MS-lasso and the adaptive MS-lasso to estimate the component effects and compare with the methods listed in Table 1. For the monotone spline methods, a monotone I-spline basis of order two with six knots evenly distributed at quantiles is used for all functions f_j. For the BS-lasso, a quadratic B-spline basis is used with six evenly distributed
Data illustrations
In this section the use of the proposed methods is illustrated on two relevant data sets from genomics.
Discussion
The additive model provides a flexible alternative to the standard linear regression model. However, the monotonicity in the linear model is attractive and in some settings it is sensible to preserve monotonicity in the additive models by imposing restrictions on the additive components. In this paper we have proposed a new method for estimation and variable selection in high-dimensional additive models that is restricted to monotone effects. By combining group selection with spline
Acknowledgments
We would like to thank Hiroko Solvang for providing the breast cancer data, and Sjur Reppe who provided the bone biopsy data. We would also like to thank the Editor, Associate Editor and two anonymous reviewers for valuable comments and suggestions.
References (29)
- Avalos et al. Parsimonious additive modeling. Comput. Statist. Data Anal. (2007)
- Meinshausen. Relaxed lasso. Comput. Statist. Data Anal. (2007)
- Reppe et al. Eight genes are highly associated with BMD variation in postmenopausal Caucasian women. Bone (2010)
- Bacchetti. Additive isotonic models. J. Amer. Statist. Assoc. (1989)
- Barlow et al. Statistical Inference Under Order Restrictions: The Theory and Application of Isotonic Regression (1972)
- Chiquet et al. Sparsity with sign-coherent groups of variables via the cooperative-lasso. Ann. Appl. Stat. (2012)
- Spline and kernel regression under shape restrictions
- Fang and Meinshausen. Lasso isotone for high-dimensional additive isotonic regression. J. Comput. Graph. Statist. (2012)
- Hastie and Tibshirani. Generalized Additive Models (1990)
- He and Shi. Monotone B-spline smoothing. J. Amer. Statist. Assoc. (1998)
- Huang et al. Variable selection in nonparametric additive models. Ann. Statist. (2010)
- Estimation and variable selection for semiparametric additive partial linear models. Statist. Sinica
Cited by (4)
- Partially linear monotone methods with automatic variable selection and monotonicity direction discovery. Statistics in Medicine (2020)
- Additive monotone regression in high and lower dimensions. Statistics Surveys (2019)
- Bayesian spatial monotonic multiple regression. Biometrika (2018)