
Moderate deviations inequalities for Gaussian process regression

Published online by Cambridge University Press: 05 June 2023

Jialin Li*
Affiliation: University of Toronto

Ilya O. Ryzhov**
Affiliation: University of Maryland

*Postal address: Rotman School of Management, University of Toronto, Ontario, Canada. Email address: jln.li@rotman.utoronto.ca
**Postal address: Robert H. Smith School of Business, University of Maryland, College Park, MD 20742, USA. Email address: iryzhov@rhsmith.umd.edu

Abstract

Gaussian process (GP) regression is widely used to model an unknown function on a continuous domain by interpolating a discrete set of observed design points. We develop a theoretical framework for proving new moderate deviations inequalities on different types of error probabilities that arise in GP regression. Two specific examples of broad interest are the probability of falsely ordering pairs of points (incorrectly estimating one point as being better than another) and the tail probability of the estimation error at an arbitrary point. Our inequalities connect these probabilities to the mesh norm, which measures how well the design points fill the space.
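To make the objects in the abstract concrete, the following is a minimal illustrative sketch, not taken from the paper: it computes the GP posterior mean and variance at test points and an empirical estimate of the mesh norm (fill distance) of the design. The squared-exponential kernel, the one-dimensional domain, and all names and parameter values (lengthscale, jitter, test function) are hypothetical choices for illustration only.

```python
# Illustrative sketch (assumptions noted above): GP regression posterior and an
# empirical estimate of the mesh norm (fill distance) of the design points.
import numpy as np

def se_kernel(A, B, lengthscale=0.2):
    """Squared-exponential covariance between the row vectors of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def gp_posterior(X, y, Xstar, jitter=1e-6):
    """Posterior mean and variance of a zero-mean GP at the test points Xstar."""
    K = se_kernel(X, X) + jitter * np.eye(len(X))
    Ks = se_kernel(X, Xstar)
    Kss = se_kernel(Xstar, Xstar)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))   # K^{-1} y
    mean = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.diag(Kss) - (v**2).sum(0)
    return mean, var

def mesh_norm(X, domain_grid):
    """Fill distance: largest distance from any domain point to its nearest design point."""
    d = np.sqrt(((domain_grid[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    return d.min(axis=1).max()

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(15, 1))        # design points on [0, 1]
y = np.sin(6.0 * X[:, 0])                      # noiseless observations of a toy function
grid = np.linspace(0.0, 1.0, 400)[:, None]     # dense grid standing in for the domain
mu, var = gp_posterior(X, y, grid)
print("mesh norm:", mesh_norm(X, grid))
print("max posterior std dev:", np.sqrt(var.max()))
```

The two printed quantities correspond to the ingredients the abstract relates: the mesh norm of the design and the uncertainty of the GP estimate over the domain.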

Type
Original Article
Copyright
© The Author(s), 2023. Published by Cambridge University Press on behalf of Applied Probability Trust

