A convex version of multivariate adaptive regression splines

https://doi.org/10.1016/j.csda.2014.07.015

Highlights

  • Convex-MARS enables a convex approximation without degrading the quality of fit.

  • Convex-MARS is appropriate for approximations in convex optimization problems.

  • The threshold version of Convex-MARS provides stronger convexity and better accuracy.

Abstract

Multivariate adaptive regression splines (MARS) provide a flexible statistical modeling method that employs forward and backward search algorithms to identify the combination of basis functions that best fits the data and simultaneously conduct variable selection. In optimization, MARS has been used successfully to estimate the unknown functions in stochastic dynamic programming (SDP), stochastic programming, and a Markov decision process, and MARS could be potentially useful in many real-world optimization problems where objective (or other) functions need to be estimated from data, such as in surrogate optimization. Many optimization methods depend on convexity, but a non-convex MARS approximation is inherently possible because interaction terms are products of univariate terms. In this paper a convex MARS modeling algorithm is described. In order to ensure MARS convexity, two major modifications are made: (1) coefficients are constrained, such that pairs of basis functions are guaranteed to jointly form convex functions and (2) the form of interaction terms is altered to eliminate the inherent non-convexity. Convexity of the overall MARS model then follows from the fact that a sum of convex functions is convex. Convex-MARS is applied to inventory forecasting SDP problems with four and nine dimensions and to an air quality ground-level ozone problem.

Introduction

Computer modeling is having a profound effect on scientific research. Many processes are so complex that physical experimentation is too time-consuming, too expensive or simply impossible. As a result, experimenters have increasingly turned to mathematical models to simulate these complex systems. Advances in computational power have allowed both greater complexity and more extensive use of such models. The purpose of design and analysis of computer experiments (DACE, Sacks et al., 1989, Kleijnen, 2008, Chen et al., 2006) is to provide methods for conducting computer experiments to build a metamodel that can be efficiently employed to improve the performance of a complex system. In DACE, the computer experiment replaces the physical experiment by organizing computer model runs and observing the model output of performance. A common DACE objective is to obtain a computationally-efficient response surface approximation (a.k.a., metamodel) of the output. This metamodel may then be used to study and potentially “optimize” the performance of the system. The effectiveness of an optimization method in using a metamodel to improve system performance depends on the convexity of the objective function (Luenberger, 2004). A non-convex metamodel requires a global optimization method, and in practice these typically cannot guarantee optimality. Consequently, if the true underlying performance objective function is known to be convex, it is highly desirable for the approximating metamodel to share this critical property.

Multivariate adaptive regression splines (MARS, Friedman, 1991) modeling has been applied in DACE-based approaches for some large-scale optimization problems, including continuous-state stochastic dynamic programming (SDP, Chen, 1999, Chen et al., 1999, Tsai et al., 2004, Tsai and Chen, 2005, Cervellera et al., 2007, Yang et al., 2007, Yang et al., 2009), Markov decision processes (MDP, Chen et al., 2003, Siddappa et al., 2007, Siddappa et al., 2008), and two-stage stochastic programming (SP, Pilla et al., 2008, Pilla et al., 2012, Shih et al., 2014). The DACE-based SDP and MDP approaches used an experimental design to discretize the continuous (or near-continuous) state space, and then used MARS to approximate the continuous value function over the state space. The MDP application studied an airline revenue management problem with the objective of more accurately estimating the fair market value of a seat over time. The two-stage SP problem studied an airline fleet assignment model that seeks an assignment of aircraft in the first stage, so that swapping of crew-compatible aircraft can be achieved in the second stage to maximize expected revenue. The DACE approach for SP was used to create a MARS approximation of the first-stage expected revenue objective function, so as to speed up the first-stage optimization. MARS has been successful in these applications not only because of the flexibility of its modeling, but also because of its parsimony. Parsimony is critical in achieving computational tractability in large-scale complex problems. Shih et al. (2014) added a data mining variable selection phase that reduced the dimension of the airline fleet assignment model from about 1200 to 400 variables prior to executing DACE, so as to reduce the computational effort of DACE from 2.5 days to an estimated 0.5 days.

Under the assumption that an optimization function f is convex, it is desired that the response surface metamodel fˆ that estimates f be convex as well. For example, in the above-mentioned SDP, MDP, and SP problems, the underlying function is theoretically convex. Convexity is not a typical assumption of statistical modeling methods, and a specialized approach must be developed. There are several options for DACE metamodeling, including polynomial response surface models (Box and Draper, 1987), spatial correlation models, a.k.a., kriging (Sacks et al., 1989), MARS, regression trees (Breiman et al., 1984, Friedman, 2001), and artificial neural networks (Haykin, 1999). None of these guarantee convexity. Convex-MARS modifies both the MARS basis functions and the MARS algorithms so that the model is built as a sum of convex functions; the final approximation is therefore convex. The C code is available from this website: http://www.uta.edu/cosmos/software.php.
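The source of non-convexity noted above, interaction terms formed as products of univariate terms, can be checked numerically. The following is a minimal Python sketch, not the authors' C implementation. The function names, the knots at zero, and the test points are illustrative assumptions. The hinge of an affine combination is shown only as one well-known convex alternative; whether it matches the altered interaction form used in Convex-MARS is an assumption.

```python
def hinge(t):
    """Truncated linear function max(0, t), the building block of MARS."""
    return max(0.0, t)


def product_interaction(x1, x2, k1=0.0, k2=0.0):
    """Original-MARS-style interaction: a product of two univariate hinges.
    The knots k1 and k2 are illustrative assumptions."""
    return hinge(x1 - k1) * hinge(x2 - k2)


def affine_hinge(x1, x2, b0=0.0, b1=1.0, b2=1.0):
    """Hinge of an affine combination of the inputs. This is convex for any
    b0, b1, b2 (a maximum of two affine functions). Assumption: used here
    only as an example of a convex interaction form."""
    return hinge(b0 + b1 * x1 + b2 * x2)


def violates_midpoint_convexity(f, p, q, tol=1e-12):
    """Return True if f at the midpoint of p and q exceeds the average of
    f(p) and f(q), which a convex function can never do."""
    mid = tuple((a + b) / 2.0 for a, b in zip(p, q))
    return f(*mid) > (f(*p) + f(*q)) / 2.0 + tol


p, q = (2.0, 0.0), (0.0, 2.0)
print(violates_midpoint_convexity(product_interaction, p, q))  # True
print(violates_midpoint_convexity(affine_hinge, p, q))         # False
```

At the two test points the product term equals 0, but at their midpoint (1, 1) it equals 1, so the product of hinges cannot be convex. The affine hinge passes the same check at these points, which is consistent with (but does not by itself prove) its convexity.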

Section snippets

Multivariate adaptive regression splines (MARS)

Friedman (1991) introduced MARS as a statistical method for high-dimensional modeling with interactions. The MARS model is essentially a linear statistical model with a forward stepwise algorithm to select model terms followed by a backward procedure to prune the model terms. A univariate version (appropriate for additive relationships) was presented by Friedman and Silverman (1989). The MARS approximation bends to model curvature at “knot” locations, and one of the objectives of the forward

Achieving convexity in MARS

To guarantee MARS convexity, two major modifications are made: (1) coefficients are constrained, such that pairs of univariate basis functions are guaranteed to jointly form convex functions and (2) the form of interaction terms is altered to eliminate the inherent non-convexity. A preliminary version of Convex-MARS (Shih et al., 2006) essentially incorporated these modifications to guarantee convexity. However, the flexibility of this version was limited, so the current paper presents an
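Modification (1) can be illustrated with an elementary fact about a reflected pair of hinge functions. The pair a·max(0, x − k) + b·max(0, k − x) has slope −b to the left of the knot k and slope a to the right, so it is convex exactly when a + b ≥ 0. The Python sketch below encodes that condition. It is an illustration under stated assumptions: the function names are hypothetical, and the exact coefficient criteria of the paper's Section 3.1 are not reproduced in this text.

```python
def paired_hinges(x, knot, a, b):
    """Evaluate a*max(0, x - knot) + b*max(0, knot - x), one reflected
    pair of univariate hinge basis functions sharing a knot."""
    return a * max(0.0, x - knot) + b * max(0.0, knot - x)


def pair_is_convex(a, b):
    """The slope is -b left of the knot and a right of it. A piecewise
    linear function is convex iff its slope never decreases, so the pair
    is convex iff -b <= a, i.e. a + b >= 0."""
    return a + b >= 0.0


def midpoint_check(a, b, knot=0.0, half_width=1.0, tol=1e-12):
    """Numerical cross-check across the knot: convexity requires the value
    at the knot to be at most the average of the two flanking values."""
    lo, hi = knot - half_width, knot + half_width
    avg = 0.5 * (paired_hinges(lo, knot, a, b) + paired_hinges(hi, knot, a, b))
    return paired_hinges(knot, knot, a, b) <= avg + tol


# Hypothetical coefficient pairs (not taken from the paper).
for a, b in [(2.0, -1.0), (1.0, 1.0), (-1.0, 0.5)]:
    print(a, b, pair_is_convex(a, b), midpoint_check(a, b))
```

Note that a negative coefficient on one member of the pair is still admissible as long as the other coefficient compensates, as in the pair (2.0, −1.0), where the slope rises from 1 to 2 across the knot.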

Convex-MARS forward coefficient restriction algorithm

In Convex-MARS, the forward stepwise procedure of original MARS is modified to check the coefficients of newly added basis functions according to the criteria described in Sections 3.1 (Convex univariate terms) and 3.2 (Convex interaction terms). This modified algorithm constrains the coefficients for the basis functions throughout the search process. Whenever there are basis functions being added to the current set of basis functions, either a paired or an unpaired basis function (univariate or

Case studies

In this section Convex-MARS is tested on four-dimensional and nine-dimensional inventory forecasting SDP problems studied by Chen (1999) and on an air quality ground-level ozone SDP problem studied by Yang et al. (2007, 2009). The goal of the inventory forecasting problem is to minimize inventory holding and backorder costs. The state of the system is represented by the inventory levels for the products and their demand forecasts. The goal of the ground-level ozone problem is to

Conclusions

The major contribution of this research is a version of MARS that guarantees convexity without degrading the quality of fit. Given the existing success of MARS in some complex, large-scale optimization problems, the convexity guarantee provides stronger motivation to employ Convex-MARS in problems with known convexity. Testing on inventory forecasting and air quality ground-level ozone SDP problems demonstrates a comparable fit to original MARS. While a significant structural modification for

Acknowledgments

This research was partially supported by the Dallas-Fort Worth International Airport contract #8002058 and National Science Foundation grant ECCS-0801802.

References (26)

  • J.H. Friedman. Multivariate adaptive regression splines. Ann. Statist. (1991)
  • J.H. Friedman. Greedy function approximation: a gradient boosting machine. Ann. Statist. (2001)
  • J.H. Friedman et al. Flexible parsimonious smoothing and additive modeling. Technometrics (1989)