Abstract
We consider the construction of surrogate models from variable fidelity samples generated by a high fidelity function (an exact representation of some physical phenomenon) and by a low fidelity function (a coarse approximation of the exact representation). The surrogate model is constructed to replace the computationally expensive high fidelity function. Gaussian processes are generally used for such tasks. However, once the sample size reaches a few thousand points, direct application of Gaussian process regression becomes impractical because of its high computational cost. We propose two approaches to circumvent this difficulty. The first approach approximates the sample covariance matrices using the Nyström method. The second approach relies on the fact that engineers can often evaluate the low fidelity function on the fly at any point using some blackbox; thus, each time we calculate a prediction of the high fidelity function at some point, we can update the surrogate model with the low fidelity function value at this point. In this way we avoid the issues related to the inversion of large covariance matrices, since the model can be constructed using only a moderately sized low fidelity sample. We apply the developed methods to a real problem: optimization of the shape of a rotating disk.
Acknowledgments
We thank Dmitry Khominich from DATADVANCE llc for making the solvers for the rotating disk problem available, and Tatyana Alenkaya from MIPT for proofreading the article. The research was conducted at IITP RAS and supported solely by the Russian Science Foundation grant (project 14-50-00150).
Appendices
A Proof of Technical Statements
In this section we provide the proofs of the statements of Sect. 4.
Proof of Statement 1
For the posterior mean we get:
We use the same approach to derive an equation for the posterior variance:
Proof of Statement 2
First of all, we have to calculate the matrices \(\mathbf {V}_{11}\) and \(\mathbf {V}= \mathbf {R}\mathbf {K}_1 \mathbf {V}_{11}^{-T}\). The matrix \(\mathbf {V}_{11}\) is of size \(n_1 \times n_1\), so we need \(O(n_1^3)\) operations to get its inverse. To calculate \(\mathbf {K}_1 \mathbf {V}_{11}^{-T}\) we need \(O(n_1^2 n)\) operations. Finally, as \(\mathbf {R}\) is a diagonal matrix, we use \(O(n_1 n)\) operations to get \(\mathbf {V}\).
In the case \(n^* = 1\), to get the posterior mean we have to calculate \(\mathbf {V}_{11} (\mathbf {I}_{n_1} + \mathbf {V}^T \mathbf {V})^{-1} \mathbf {V}^T \mathbf {y}\). We use \(O(n_1^2 n)\) operations to calculate \(\mathbf {V}^T \mathbf {V}\); to invert \(\mathbf {I}_{n_1} + \mathbf {V}^T \mathbf {V}\) we need \(O(n_1^3)\) operations; to calculate \(\mathbf {V}_{11} (\mathbf {I}_{n_1} + \mathbf {V}^T \mathbf {V})^{-1} \mathbf {V}^T\) we use an extra \(O(n_1^2 n)\) operations; and finally, to calculate the posterior mean we need an additional \(O(n_1 n)\) operations. Consequently, to calculate the posterior mean we use \(O(n_1^2 n)\) operations.
In the same way, to calculate \(\mathbf {V}_{11} (\mathbf {I}_{n_1} + \mathbf {V}^T \mathbf {V})^{-1} \mathbf {V}_{11}^{-1}\) we need \(O(n_1^2 n)\) operations to calculate \((\mathbf {I}_{n_1} + \mathbf {V}^T \mathbf {V})^{-1}\) and an additional \(O(n_1^3)\) operations to get the final matrix. Consequently, to calculate the posterior variance we use \(O(n_1^2 n)\) operations.
Finally, we need \(O(n_1^2 n)\) operations to compute the required matrices and \(O(n_1^2 n)\) operations to obtain the posterior mean and the posterior variance from these precomputed matrices. The \(O(n_1^3)\) terms are dominated by \(O(n_1^2 n)\) when \(n_1 \le n\), so the total computational complexity is \(O(n_1^2 n)\).
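The operation count above can be illustrated with a minimal NumPy sketch. It is not the exact formulation of Statement 2: it uses the standard low-rank (Nyström, subset-of-regressors) form of the Gaussian process posterior mean, in which only an \(n_1 \times n_1\) linear system is solved. The squared-exponential kernel, the noise variance, the toy data, and all function names are assumptions made for illustration. The dense route is included only to check that both computations agree.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.15):
    """Squared-exponential kernel between the rows of a (m x d) and b (k x d)."""
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dists / length_scale**2)


def low_rank_posterior_mean(x_train, y_train, x_base, x_test, noise_var=1e-2):
    """Posterior mean under the Nystrom-approximated covariance.

    Only an n_1 x n_1 system is solved; the dominant cost is the
    O(n_1^2 n) product k_1n @ k_1n.T.
    """
    k_11 = rbf_kernel(x_base, x_base)   # n_1 x n_1
    k_1n = rbf_kernel(x_base, x_train)  # n_1 x n
    k_s1 = rbf_kernel(x_test, x_base)   # n* x n_1
    small = noise_var * k_11 + k_1n @ k_1n.T
    return k_s1 @ np.linalg.solve(small, k_1n @ y_train)


def dense_posterior_mean(x_train, y_train, x_base, x_test, noise_var=1e-2):
    """Same mean via the explicit n x n approximate covariance, O(n^3) cost."""
    k_11 = rbf_kernel(x_base, x_base)
    k_1n = rbf_kernel(x_base, x_train)
    k_s1 = rbf_kernel(x_test, x_base)
    proj = np.linalg.solve(k_11, k_1n)  # K_11^{-1} K_1n
    k_nn = k_1n.T @ proj                # Nystrom approximation of the n x n matrix
    k_sn = k_s1 @ proj                  # approximate test/train covariance
    full = k_nn + noise_var * np.eye(len(x_train))
    return k_sn @ np.linalg.solve(full, y_train)


# Toy data (assumed for illustration): n = 200 points, n_1 = 8 base points.
rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 1.0, size=(200, 1))
y_train = np.sin(6.0 * x_train[:, 0]) + 0.1 * rng.standard_normal(200)
x_base = np.linspace(0.0, 1.0, 8)[:, None]
x_test = np.linspace(0.0, 1.0, 5)[:, None]

fast = low_rank_posterior_mean(x_train, y_train, x_base, x_test)
slow = dense_posterior_mean(x_train, y_train, x_base, x_test)
print(np.max(np.abs(fast - slow)))  # maximum absolute difference between the two routes
```

The two functions compute the same quantity; the low-rank route never forms or factorizes an \(n \times n\) matrix, which is where the \(O(n_1^2 n)\) total comes from.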
B Comparison of Low and High Fidelity Model for Rotating Disk
Two solvers are available for calculating \(u_\mathrm{max}\) and \(s_\mathrm{max}\). The low fidelity function is calculated using an ordinary differential equation (ODE) solver based on a simple Runge–Kutta method. The high fidelity function is calculated using a finite element model (FEM) solver from ANSYS.
To compare the solvers we draw scatter plots of the low and high fidelity values and also plot slices of the corresponding functions. We generate a random sample of points in a specified design space box, calculate the low and high fidelity function values, and plot the low fidelity values against the high fidelity values at the same points. The scatter plots are shown in Fig. 3: the difference between the low and high fidelity values increases significantly as the values themselves increase.
For the central point of the design space box with \(r_1 = 0.06, r_2 = 0.13, r_3 = 0.16, r_4 = 0.185, t_1 = 0.027, t_3 = 0.027\) we construct one-dimensional slices by varying a single input variable within specified bounds. Slices for different input variables for \(u_\mathrm{max}\) and for \(s_\mathrm{max}\) are given in Fig. 4. For \(u_\mathrm{max}\) the high and low fidelity functions have the same behaviour, and the low fidelity function models the high fidelity function accurately. For \(s_\mathrm{max}\) the high and low fidelity functions sometimes differ: their behaviours differ for the slice along the \(r_1\) input, and their local maxima differ for the slice along the \(t_3\) input.
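The slicing procedure can be sketched as follows. Only the central point is taken from the text. The bounds for \(r_1\) and the two toy functions are hypothetical placeholders standing in for the ODE and FEM solvers, which are not reproduced here.

```python
# Central point of the design space box, as given in the text.
CENTER = {"r1": 0.06, "r2": 0.13, "r3": 0.16, "r4": 0.185, "t1": 0.027, "t3": 0.027}


def one_dim_slice(func, center, name, lower, upper, num_points=5):
    """Evaluate func along one input while the others stay at the central point.

    num_points must be at least 2.
    """
    step = (upper - lower) / (num_points - 1)
    grid = [lower + i * step for i in range(num_points)]
    values = []
    for value in grid:
        point = dict(center)  # copy, so the central point itself is not modified
        point[name] = value
        values.append(func(point))
    return grid, values


def toy_low_fidelity(point):
    """Hypothetical stand-in for the ODE solver output (not the real model)."""
    return point["r4"] ** 2 / point["t1"]


def toy_high_fidelity(point):
    """Hypothetical stand-in for the FEM solver output (not the real model)."""
    return toy_low_fidelity(point) * (1.0 + 2.0 * point["r1"])


# Hypothetical bounds for r1; the text only says "specified bounds".
grid, low = one_dim_slice(toy_low_fidelity, CENTER, "r1", 0.05, 0.07)
_, high = one_dim_slice(toy_high_fidelity, CENTER, "r1", 0.05, 0.07)
for r1, lo, hi in zip(grid, low, high):
    print(f"r1={r1:.3f}  low={lo:.4f}  high={hi:.4f}  diff={hi - lo:.4f}")
```

Replacing the two toy functions with calls to the actual solvers and repeating the loop for each input name would give the data behind one panel per input variable.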
© 2016 Springer International Publishing Switzerland
Cite this paper
Zaytsev, A. (2016). Variable Fidelity Regression Using Low Fidelity Function Blackbox and Sparsification. In: Gammerman, A., Luo, Z., Vega, J., Vovk, V. (eds) Conformal and Probabilistic Prediction with Applications. COPA 2016. Lecture Notes in Computer Science(), vol 9653. Springer, Cham. https://doi.org/10.1007/978-3-319-33395-3_11
DOI: https://doi.org/10.1007/978-3-319-33395-3_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-33394-6
Online ISBN: 978-3-319-33395-3
eBook Packages: Computer Science, Computer Science (R0)