A convex version of multivariate adaptive regression splines

doi:10.1016/j.csda.2014.07.015

Computational Statistics & Data Analysis

Volume 81, January 2015, Pages 89-106

https://doi.org/10.1016/j.csda.2014.07.015

Highlights

• Convex-MARS enables a convex approximation without degrading the quality of fit.
• Convex-MARS is appropriate for approximations in convex optimization problems.
• The threshold version of Convex-MARS provides stronger convexity and better accuracy.

Abstract

Multivariate adaptive regression splines (MARS) provide a flexible statistical modeling method that uses forward and backward search algorithms to identify the combination of basis functions that best fits the data while simultaneously conducting variable selection. In optimization, MARS has been used successfully to estimate the unknown functions in stochastic dynamic programming (SDP), stochastic programming, and a Markov decision process, and it could be useful in many real-world optimization problems where objective (or other) functions must be estimated from data, such as surrogate optimization. Many optimization methods depend on convexity, but a MARS approximation can be non-convex because its interaction terms are products of univariate terms. This paper describes a convex MARS modeling algorithm. To ensure convexity, two major modifications are made: (1) coefficients are constrained so that pairs of basis functions are guaranteed to jointly form convex functions, and (2) the form of the interaction terms is altered to eliminate the inherent non-convexity. Convexity of the full model then follows because a sum of convex functions is convex. Convex-MARS is applied to inventory forecasting SDP problems with four and nine dimensions and to an air quality ground-level ozone problem.

Introduction

Computer modeling is having a profound effect on scientific research. Many processes are so complex that physical experimentation is too time-consuming, too expensive, or simply impossible. As a result, experimenters have increasingly turned to mathematical models to simulate these complex systems. Advances in computational power have allowed both greater complexity and more extensive use of such models. The purpose of design and analysis of computer experiments (DACE; Sacks et al., 1989; Kleijnen, 2008; Chen et al., 2006) is to provide methods for conducting computer experiments to build a metamodel that can be efficiently employed to improve the performance of a complex system. In DACE, the computer experiment replaces the physical experiment by organizing computer model runs and observing the model output of performance. A common DACE objective is to obtain a computationally efficient response surface approximation (a.k.a. metamodel) of the output. This metamodel may then be used to study and potentially “optimize” the performance of the system. The effectiveness of an optimization method in using a metamodel to improve system performance depends on the convexity of the objective function (Luenberger, 2004). A non-convex metamodel requires a global optimization method, and in practice such methods typically cannot guarantee optimality. Consequently, if the true underlying performance objective function is known to be convex, it is highly desirable for the approximating metamodel to share this critical property.

Multivariate adaptive regression splines (MARS; Friedman, 1991) modeling has been applied in DACE-based approaches for some large-scale optimization problems, including continuous-state stochastic dynamic programming (SDP; Chen, 1999; Chen et al., 1999; Tsai et al., 2004; Tsai and Chen, 2005; Cervellera et al., 2007; Yang et al., 2007; Yang et al., 2009), Markov decision processes (MDP; Chen et al., 2003; Siddappa et al., 2007; Siddappa et al., 2008), and two-stage stochastic programming (SP; Pilla et al., 2008; Pilla et al., 2012; Shih et al., 2014). The DACE-based SDP and MDP approaches used an experimental design to discretize the continuous (or near-continuous) state space, and then used MARS to approximate the continuous value function over the state space. The MDP application studied an airline revenue management problem with the objective of more accurately estimating the fair market value of a seat over time. The two-stage SP problem studied an airline fleet assignment model that seeks an assignment of aircraft in the first stage, so that swapping of crew-compatible aircraft can be achieved in the second stage to maximize expected revenue. The DACE approach for SP was used to create a MARS approximation of the first-stage expected revenue objective function, so as to speed up the first-stage optimization. MARS has been successful in these applications not only because of the flexibility of its modeling, but also because of its parsimony. Parsimony is critical in achieving computational tractability in large-scale complex problems. Shih et al. (2014) added a data mining variable selection phase that reduced the dimension of the airline fleet assignment model from about 1200 to 400 variables prior to executing DACE, so as to reduce the computational effort of DACE from 2.5 days to an estimated 0.5 days.

Under the assumption that an optimization function $f$ is convex, it is desired that the response surface metamodel $\hat{f}$ that estimates $f$ be convex as well. For example, in the above-mentioned SDP, MDP, and SP problems, the underlying function is theoretically convex. Convexity is not a typical assumption of statistical modeling methods, so a specialized approach must be developed. There are several options for DACE metamodeling, including polynomial response surface models (Box and Draper, 1987), spatial correlation models, a.k.a. kriging (Sacks et al., 1989), MARS, regression trees (Breiman et al., 1984; Friedman, 2001), and artificial neural networks (Haykin, 1999). None of these guarantees convexity. Convex-MARS modifies both the MARS basis functions and the MARS algorithms to build a sum of convex functions; the final approximation is therefore convex. The C code is available from this website: http://www.uta.edu/cosmos/software.php.

Section snippets

Multivariate adaptive regression splines (MARS)

Friedman (1991) introduced MARS as a statistical method for high-dimensional modeling with interactions. The MARS model is essentially a linear statistical model with a forward stepwise algorithm to select model terms followed by a backward procedure to prune the model terms. A univariate version (appropriate for additive relationships) was presented by Friedman and Silverman (1989). The MARS approximation bends to model curvature at “knot” locations, and one of the objectives of the forward
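As a rough illustration of the model form discussed in this section, the sketch below evaluates a MARS-style approximation built from truncated linear ("hinge") basis functions that bend at knots, with interaction terms formed as products of univariate hinges. The function names (`hinge`, `mars_predict`), the term encoding, and the example coefficients are assumptions made for illustration only; they are not taken from the paper or from the authors' C code.

```python
from math import prod

def hinge(x, knot, sign):
    """Truncated linear basis function: max(0, sign * (x - knot)), with sign +1 or -1."""
    return max(0.0, sign * (x - knot))

def mars_predict(x, intercept, terms):
    """Evaluate a MARS-style model at the point x (a sequence of floats).

    terms is a list of (coefficient, factors) pairs; each factor is a
    (variable_index, knot, sign) triple. One factor gives a univariate
    term, several factors give an interaction term (a product of hinges).
    """
    total = intercept
    for coef, factors in terms:
        total += coef * prod(hinge(x[v], k, s) for v, k, s in factors)
    return total

# Hypothetical model: a pair of hinges on x0 at knot 1.0 plus one interaction.
example_terms = [
    (2.0, [(0, 1.0, +1)]),                # 2.0 * max(0, x0 - 1)
    (0.5, [(0, 1.0, -1)]),                # 0.5 * max(0, 1 - x0)
    (1.5, [(0, 1.0, +1), (1, 0.0, +1)]),  # 1.5 * max(0, x0 - 1) * max(0, x1)
]

print(mars_predict([3.0, 2.0], 1.0, example_terms))  # 1 + 4 + 0 + 6 = 11.0
```

The model is linear in its coefficients, which is why ordinary least squares can be used once the set of basis functions is fixed.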

Achieving convexity in MARS

To guarantee MARS convexity, two major modifications are made: (1) coefficients are constrained, such that pairs of univariate basis functions are guaranteed to jointly form convex functions and (2) the form of interaction terms is altered to eliminate the inherent non-convexity. A preliminary version of Convex-MARS (Shih et al., 2006) essentially incorporated these modifications to guarantee convexity. However, the flexibility of this version was limited, so the current paper presents an
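The two sources of non-convexity named above can be checked numerically. The sketch below uses a simple midpoint test: a convex function never exceeds the average of its endpoint values at the midpoint. It shows that a product of two hinges fails the test, and that a pair of hinges sharing a knot passes it when the two coefficients sum to a non-negative value. That sign condition is the standard one for a piecewise-linear function with a single kink and is used here as an assumption; the paper's exact criteria (its Sections 3.1 and 3.2) are not reproduced in this excerpt. All function names are hypothetical.

```python
def hinge(x, knot, sign):
    """Truncated linear basis function: max(0, sign * (x - knot))."""
    return max(0.0, sign * (x - knot))

def violates_midpoint_convexity(f, a, b, tol=1e-12):
    """Return True if f at the midpoint of a and b exceeds the average of f(a) and f(b)."""
    mid = [(ai + bi) / 2.0 for ai, bi in zip(a, b)]
    return f(mid) > (f(a) + f(b)) / 2.0 + tol

def interaction(x):
    """Product of two univariate hinges, as in an original MARS interaction term."""
    return hinge(x[0], 0.0, +1) * hinge(x[1], 0.0, +1)

def pair(beta_plus, beta_minus, knot=0.0):
    """Pair of hinges on one variable sharing a knot; slopes are -beta_minus (left) and beta_plus (right)."""
    return lambda x: beta_plus * hinge(x[0], knot, +1) + beta_minus * hinge(x[0], knot, -1)

# The product term is zero at (2, 0) and (0, 2) but equals 1 at the midpoint (1, 1).
print(violates_midpoint_convexity(interaction, [2.0, 0.0], [0.0, 2.0]))  # True

# beta_plus + beta_minus = 1 >= 0: the slope increases across the knot, so no violation.
print(violates_midpoint_convexity(pair(2.0, -1.0), [-1.0], [1.0]))       # False

# beta_plus + beta_minus = -1 < 0: the slope decreases across the knot, so the kink is concave.
print(violates_midpoint_convexity(pair(1.0, -2.0), [-1.0], [1.0]))       # True
```

A single failed midpoint test proves non-convexity, whereas a passed test at one pair of points is only evidence; the sign condition on the coefficients is what gives the guarantee for the univariate pair.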

Convex-MARS forward coefficient restriction algorithm

In Convex-MARS, the forward stepwise procedure of original MARS is modified to check the coefficients of newly added basis functions according to the criteria described in Sections 3.1 (Convex univariate terms) and 3.2 (Convex interaction terms). This modified algorithm constrains the coefficients for the basis functions throughout the search process. Whenever there are basis functions being added to the current set of basis functions, either a paired or an unpaired basis function (univariate or
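The idea of checking coefficients during the forward search can be sketched for the simplest case: one variable and one candidate pair of hinges per knot. The code below is not the authors' algorithm. It assumes NumPy is available, uses an ordinary least-squares fit, and applies a hypothetical acceptance rule (the two pair coefficients must sum to a non-negative value) in place of the paper's criteria, which are not given in this excerpt. Interaction terms and unpaired basis functions are omitted.

```python
import numpy as np

def fit_pair(x, y, knot):
    """Least-squares fit of y ~ b0 + bp * max(0, x - knot) + bm * max(0, knot - x)."""
    A = np.column_stack([
        np.ones_like(x),
        np.maximum(0.0, x - knot),
        np.maximum(0.0, knot - x),
    ])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((A @ coef - y) ** 2))
    return coef, rss

def best_convex_knot(x, y, candidate_knots):
    """Return (knot, coef, rss) for the lowest-RSS candidate that passes the check, or None."""
    best = None
    for knot in candidate_knots:
        coef, rss = fit_pair(x, y, knot)
        # Hypothetical convexity check: reject pairs whose kink bends downward.
        if coef[1] + coef[2] < 0.0:
            continue
        if best is None or rss < best[2]:
            best = (knot, coef, rss)
    return best

# Synthetic convex target with a kink at 0.5 (illustrative data, not from the paper).
x = np.linspace(-2.0, 2.0, 41)
y = np.abs(x - 0.5)

knot, coef, rss = best_convex_knot(x, y, [-1.0, 0.0, 0.5, 1.0])
print(knot, np.round(coef, 6), round(rss, 6))
```

Rejecting a candidate outright is only one way to restrict coefficients; a constrained fit is another, and the excerpt does not say which the paper uses.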

Case studies

In this section Convex-MARS is tested on four-dimensional and nine-dimensional inventory forecasting SDP problems studied by Chen (1999) and on an air quality ground-level ozone SDP problem studied by Yang et al. (2007, 2009). The goal of the inventory forecasting problem is to minimize inventory holding and backorder costs. The state of the system is represented by the inventory levels for the products and their demand forecasts. The goal of the ground-level ozone problem is to

Conclusions

The major contribution of this research is a version of MARS that guarantees convexity without degrading the quality of fit. Given the existing success of MARS in some complex, large-scale optimization problems, the convexity guarantee provides stronger motivation to employ Convex-MARS in problems with known convexity. Testing on inventory forecasting and air quality ground-level ozone SDP problems demonstrates a comparable fit to original MARS. While a significant structural modification for

Acknowledgments

This research was partially supported by the Dallas-Fort Worth International Airport contract #8002058 and National Science Foundation grant ECCS-0801802.

References (26)

C. Cervellera et al. Neural network and regression spline value function approximations for stochastic dynamic programming. Comput. Oper. Res. (2007)

V.C.P. Chen. Application of MARS and orthogonal arrays to inventory forecasting stochastic dynamic programs. Comput. Statist. Data Anal. (1999)

V.L. Pilla et al. A multivariate adaptive regression splines cutting plane approach for solving a two-stage stochastic programming fleet assignment model. Eur. J. Oper. Res. (2012)

B. Ariyajunya. Adaptive dynamic programming for high-dimensional multicollinear state spaces. (2012)

G.E.P. Box et al. Empirical Model Building and Response Surfaces. (1987)

L. Breiman et al. Classification and Regression Trees. (1984)

V.C.P. Chen. Applying experimental design and regression splines to high-dimensional continuous-state stochastic dynamic programming. (1993)

V.C.P. Chen et al. Solving for an optimal airline yield management policy via statistical learning. J. Roy. Statist. Soc. Ser. C (2003)

V.C.P. Chen et al. Applying experimental design and regression splines to high-dimensional continuous-state stochastic dynamic programming. Oper. Res. (1999)

V.C.P. Chen et al. Design, modeling, and applications of computer experiments. IIE Trans. (2006)

J.H. Friedman. Multivariate adaptive regression splines. Ann. Statist. (1991)

J.H. Friedman. Greedy function approximation: a gradient boosting machine. Ann. Statist. (2001)

J.H. Friedman et al. Flexible parsimonious smoothing and additive modeling. Technometrics (1989)
