OSTI.GOV · U.S. Department of Energy
Office of Scientific and Technical Information

Title: Machine learning-based surrogate modeling for data-driven optimization: a comparison of subset selection for regression techniques

Journal Article · Optimization Letters

Optimization of simulation-based or data-driven systems is a challenging task that has attracted significant attention in the recent literature. An efficient approach for optimizing systems that lack analytical expressions is to fit surrogate models. Because of their flexibility, nonlinear interpolating functions, such as radial basis functions and Kriging, have been the predominant surrogates for data-driven optimization; however, these methods lead to complex nonconvex formulations. Commonly used regression-based surrogates lead to simpler formulations, but they are less flexible and are inaccurate if the functional form is not known a priori. In this work, we investigate the efficiency of subset selection regression techniques for developing surrogate functions that balance accuracy and complexity. Subset selection creates sparse regression models by selecting only a subset of the original features, which are linearly combined to generate a diverse set of surrogate models. Five subset selection techniques are compared with commonly used nonlinear interpolating surrogate functions with respect to optimization solution accuracy, computation time, sampling requirements, and model sparsity. Our results indicate that subset selection-based regression functions exhibit promising performance when the dimensionality is low, while interpolation performs better for higher dimensional problems.
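
As a rough illustration of the subset selection idea described in the abstract, the following Python sketch builds a small library of candidate features, then greedily adds the feature that most reduces the squared error until the fit is good enough. The abstract does not name the five techniques compared, so greedy forward stepwise selection is used here only as an illustrative stand-in and should not be read as the authors' method. The quadratic basis, the sample grid, the test function, and the names `basis`, `fit_subset`, and `forward_select` are all assumptions made for this example.

```python
import itertools

# Hypothetical candidate features for a 2-D input (an assumption, not from the paper).
FEATURE_NAMES = ["1", "x1", "x2", "x1^2", "x1*x2", "x2^2"]


def basis(x1, x2):
    """Evaluate every candidate feature at one sample point."""
    return [1.0, x1, x2, x1 * x1, x1 * x2, x2 * x2]


def fit_subset(cols, y):
    """Least-squares fit on the given feature columns.

    Solves the normal equations by Gaussian elimination with partial
    pivoting and returns (weights, sum of squared errors).
    """
    k = len(cols)
    # Augmented matrix [A^T A | A^T y].
    m = [[sum(a * b for a, b in zip(cols[i], cols[j])) for j in range(k)]
         + [sum(a * t for a, t in zip(cols[i], y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, k):
            f = m[r][c] / m[c][c]
            for j in range(c, k + 1):
                m[r][j] -= f * m[c][j]
    w = [0.0] * k
    for i in reversed(range(k)):
        w[i] = (m[i][k] - sum(m[i][j] * w[j] for j in range(i + 1, k))) / m[i][i]
    resid = [t - sum(w[j] * cols[j][i] for j in range(k)) for i, t in enumerate(y)]
    return w, sum(r * r for r in resid)


def forward_select(X, y, max_terms, tol=1e-12):
    """Greedy forward selection: add one feature per step, keeping the best fit."""
    cols = [list(c) for c in zip(*X)]
    selected, weights = [], []
    for _ in range(max_terms):
        trials = []
        for j in range(len(cols)):
            if j not in selected:
                w, err = fit_subset([cols[t] for t in selected + [j]], y)
                trials.append((err, j, w))
        err, j, weights = min(trials, key=lambda t: (t[0], t[1]))
        selected.append(j)
        if err <= tol:  # stop early once the sparse model fits the samples
            break
    return selected, weights


# Noise-free samples of a made-up "black-box" response on a symmetric grid.
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
samples = list(itertools.product(grid, grid))
X = [basis(x1, x2) for x1, x2 in samples]
y = [1.5 * x1 - 2.0 * x1 * x2 for x1, x2 in samples]

selected, weights = forward_select(X, y, max_terms=3)
print([FEATURE_NAMES[j] for j in selected], weights)
```

On this toy data the procedure keeps only the `x1` and `x1*x2` terms out of six candidates, which is the kind of sparse, easy-to-optimize surrogate the abstract contrasts with interpolating models. With noisy or higher dimensional data, a greedy search like this one may not recover the true terms.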

Research Organization:
RAPID Manufacturing Institute, New York, NY (United States)
Sponsoring Organization:
USDOE Office of Energy Efficiency and Renewable Energy (EERE), Energy Efficiency Office. Advanced Manufacturing Office
Grant/Contract Number:
EE0007888
OSTI ID:
1642435
Journal Information:
Optimization Letters, Vol. 14, Issue 4; ISSN 1862-4472
Publisher:
Springer Nature
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 40 works
Citation information provided by Web of Science

