Simultaneous fitting of Bayesian penalised quantile splines

https://doi.org/10.1016/j.csda.2018.12.009Get rights and content

Abstract

Bayesian simultaneous estimation of nonparametric quantile curves is a challenging problem, requiring a flexible and robust data model whilst satisfying the monotonicity or noncrossing constraints on the quantiles. The pyramid quantile regression method for simultaneous linear quantile fitting is adapted for the spline regression setting. In high dimensional problems, the choice of the pyramid locations becomes crucial for a robust parameter estimation. The optimal pyramid locations are derived which then allows for an efficient adaptive block-update MCMC scheme to be proposed for posterior computation. Simulation studies show that the proposed method provides estimates with significantly smaller errors and better empirical coverage probability when compared to existing alternative approaches. The method is illustrated with three real applications.

Introduction

Quantile regression models (QR; Koenker et al., 1978) provide a comprehensive description of the conditional distribution of the response variable by capturing the effect of covariates at different quantile levels. It is a robust alternative to ordinary mean regression and has been embraced by a variety of fields, including biology, economics, environmental sciences, medicine and ecology (Fitzenberger et al., 2002, Reich et al., 2011, Portnoy, 2003, Cade et al., 1999). For modelling the τ-th conditional quantile curve of a response variable Y given the value X=x of some covariate X as Qτ(Y|x)=fτ(x), τ(0,1), consider Y|X=xfτ(x)+ϵ,where the error variable ϵ satisfies Qτ(ϵ|x)inf{a:P(ϵa|X=x)τ}=0, and the function fτ(x) changes with τ and describes the relationship between X and Y. In this article, we are interested in the case where the curve fτ is modelled with spline functions of a given degree, P1, so that fτ(x)=α0+j=1Pαjxj+k=1Kηk(xγk)+P,where z+=max(0,z) and where γk,k=1,,K, represent the locations of K knot points (see Hastie and Tibshirani (1990)). Typically, the degree P is set to 3, since cubic splines are known to approximate locally smooth functions sufficiently well.

Frequentist literature on quantile regression is largely based on the seminal work by Koenker et al. (1978), in which the noise distribution is left unspecified and quantiles are estimated by solving the linear optimisation problem Q̂τ(Y|x)=arg minfτ(x)i=1nρτ(yifτ(xi)),where ρτ() is an asymmetric loss function given by ρτ(ϵ)=τϵ if ϵ0 and ρτ(ϵ)=(τ1)ϵ otherwise. In the spline fitting context, a penalty term is added to (2) in order to restrict the class of functions fτ(x) to be sufficiently smooth. However, the simplicity of linear programming is jeopardised when the classical quadratic penalty, L2=(f(x))2dx, is added to this objective function (Bosch et al., 1995). Koenker et al. (1994) advocate the use of a linear norm under the total variation roughness penalty, L1=|(f(x))|dx, nevertheless one becomes harshly delimited by the space of piecewise linear fits. On the other hand, under the scope of linear programming, extension to quadratic splines is still possible when using the L-norm, L=sup|f(x)|, see e.g. Koenker et al. (1994) or He and Ng (1999). In the Bayesian framework, the need for a likelihood specification was initially accommodated by the asymmetric Laplace error model introduced by Yu and Moyeed (2001), based on its analogy to the minimisation problem (2). Smoothing cubic splines can be easily incorporated considering a prior density for fτ(x) proportional to exp{12λf(x)2dx} (see Thompson et al. (2010)).

However, one common issue facing these approaches is that quantiles at different levels have to be fitted singly for each τ and final estimates may cross, i.e. estimates may not respect the monotonicity of the quantile function. In the context of nonparametric regression, the flexibility granted to the quantile curves makes crossing more severe. Postprocessing procedures (e.g. Dette and Volgushev (2008), Chernozhukov et al. (2009)) which correct for crossing still suffer from a poor borrowing of information, and can still lead to wildly variable curves across quantile levels τ. Rodrigues and Fan (2017) proposed a procedure to postprocess crossing quantiles in a Bayesian framework, however its performance is still influenced by the initial estimates. Recently, several authors have argued that simultaneous estimation offers better estimates, better global efficiency for the estimators have been observed empirically in several studies (Reich and Smith, 2013), Yang and Tokdar (2017), Fang et al. (2015), (Rodrigues et al., 2018). Indeed simultaneous quantile fitting is a challenging problem that has gained a lot of attention in recent times. A solution to the constrained minimisation problem was proposed by Bondell et al. (2010), however estimation is again limited to piecewise linear fits. One can also achieve noncrossing by restricting the class of models to the location-scale family (He, 1997), or by minimising the objective function sequentially while imposing ordering through the parameters (Muggeo et al., 2013), but there are naturally limitations when rigid assumptions are imposed. More recently, Yang and Tokdar (2017) proposed a Bayesian model for joint estimation of quantile planes using a convenient parameterisation to facilitate the incorporation of monotonicity constraints. Although its focus is linear regression, shrinkage priors for variable selection could potentially be used for fitting splines.

Pyramid quantile regression was first presented in Rodrigues et al. (2018) as an alternative Bayesian procedure for simultaneous quantile fitting. This approach compared favourably with competing methods in empirical studies. It relies on the use of several so-called quantile pyramid priors introduced by Hjort and Walker (2009) placed at some chosen locations in the covariate space. By construction this method ensures non-crossing of the different quantile planes within the convex hull of the pyramid locations. Although the spline regression model (1) can be estimated under the linear regression umbrella, the pyramid quantile regression when a large number of basis functions are used can be problematic. As a matter of fact, the choice of pyramid locations is intricate in this situation since standard convex hull algorithms either cannot cope with high-dimensional spaces or return too many vertices. This complicates estimation, as crossing of quantile curves will need to be checked at many locations for each parameter update of an MCMC sampling algorithm.

In this paper, we propose a method based on the pyramid quantile regression for simultaneously fitting several penalised spline quantile curves which automatically satisfy the non-crossing constraints. In order to achieve that, we propose to enclose the data cloud with a polytope whose number of vertices equals the number of parameters in the regression spline model, and we derive a general algorithm to find the optimal vertex locations of this convex set, efficiently eliminating all monotonicity constraints. As far as we are aware, there are no existing algorithms in the convex optimisation literature which can compute such vertices. Our approach here is also scalable to high dimensions. Next, we develop penalty criterions for estimating flexible quantile curves, and propose an efficient adaptive block-update strategy for MCMC sampling, taking advantage of the proposed convex polytope approach.

The rest of the article is organised as follows. Section 2 presents the optimal convex set enclosing the data cloud with the given number of vertices. Section 3 recalls some basics on quantile pyramids introduced by Hjort and Walker (2009) and briefly describe their use in Rodrigues et al. (2018) in a regression set-up. In Section 4 we introduce the penalised quantile splines model and describe the modelling set up. Section 5 presents simulation studies and comparisons with the best alternative approaches. Several real datasets are analysed in Section 6, and concluding remarks are discussed in the final section.

Section snippets

Convex set vertices

The convex hull of the predictor cloud has a special role in simultaneous linear quantile regression. It is well known that if the conditional quantiles do not cross at the vertices of a convex set, they do not cross anywhere inside the convex set, see for example Bondell et al. (2010). This fact reduces the problem of infinite quantile monotonicity constraints to ensuring monotonicity only at the vertices of the convex set. Rodrigues et al. (2018) used quantile pyramids at selected vertices of

Quantile pyramids

Here we provide some background on quantile pyramids, which was introduced by Hjort and Walker (2009). Quantile pyramid is a method for constructing a random probability measure via the quantile function. We will use hereafter these priors for simultaneous inference on quantile spline curves, in the spirit of the linear quantile regression model recently proposed in Rodrigues et al. (2018).

Quantile pyramid is a tree generation process with M levels where, at each level m, quantiles for fixed

Quantile pyramids for penalised splines

The proposed model for simultaneously fitting penalised splines using quantile pyramids is discussed in this section. We consider jointly modelling quantile curves, for a number T of finite quantile levels τ=τ1<τ2<<τT, as a function of a covariate xR.

Simulated examples

In this section we examine the performances of the proposed method via a simulation study. We compare its results with the ones obtained by recent alternative approaches with available codes. This includes constrained B-splines smoothing (COBs), which provides individual quantile fittings using quadratic splines with L penalty, available in the R package cobs (see Ng and Maechler (2015) and Ng and Maechler (2007)). This includes also the simultaneous quantile curve estimation available from

Motorcycle data set

Here we analyse the prominent motorcycle dataset (Silverman, 1985). The dataset is obtained from an experiment on the efficacy of crash helmets, and contains 133 observations of head acceleration (in g) as a function of time since impact (in ms). As interest lies in describing the acceleration curve, and particularly the behaviour at the tails of the distribution, quantile regression modelling is an appealing technique which has been repeatedly considered here (see Koenker, 2005, Chen and Yu,

Discussion

In this article, we introduced a fully nonparametric quantile regression model using the quantile pyramids. Our method uses K+4 quantile pyramids to model a cubic regression spline with K knots, and places these pyramids on the vertices of an optimal convex set enclosing the predictor cloud to ensure that the fitted curves will not cross.

The main features of this modelling are flexibility and interpretability, as it avoids strong parametric assumptions about the data, and yet provides a

Acknowledgements

TR is funded by CAPES Foundation via the Science Without Borders (BEX 0979/13-9). TR and YF are grateful for the support of ARC ACEMS, Australia .

References (44)

  • FangY. et al.

    Bayesian quantile regression with approximate likelihood

    Bernoulli

    (2015)
  • FitzenbergerB. et al.

    Economic applications of quantile regression

    Physica

    (2002)
  • GarthwaiteP.H. et al.

    Adaptive optimal scaling of metropolis-hastings algorithms using the robbins-monro process

    Commun. Statist. Theory Methods

    (2016)
  • Han, S., Liao, X., Dunson, D.B., Carin, L., 2016. Variational gaussian copula inference. In: 19th International...
  • HastieT.J. et al.

    Generalised Additive Models

    (1990)
  • HeX.

    Quantile curves without crossing

    Amer. Statist.

    (1997)
  • HeX. et al.

    Cobs: Qualitatively constrained smoothing via linear programming

    Comput. Statist.

    (1999)
  • HjortN.L. et al.

    Quantile pyramids for Bayesian nonparametrics

    Ann. Statist.

    (2009)
  • HolstU. et al.

    Locally weighted least squares kernel regression and statistical evaluation of lidar measurements

    Environmetrics

    (1996)
  • IsaacsD. et al.

    Serum immunoglobulin concentration in preschool children measured by laser nephelometry: reference ranges for IgG, IgA, IgM

    J. Clin. Pathol.

    (1983)
  • KoenkerR.

    Quantile regression

  • KoenkerR. et al.

    Regression quantiles

    Econometrica

    (1978)
  • Cited by (9)

    View all citing articles on Scopus
    View full text