Abstract
Gaussian Processes (GP) are a powerful framework for modeling expensive black-box functions and have thus been adopted for various challenging modeling and optimization problems. In GP-based modeling, we typically default to a stationary covariance kernel to model the underlying function over the input domain, but many real-world applications, such as controls and cyber-physical system safety, often require modeling and optimization of functions that are locally stationary and globally non-stationary across the domain; using standard GPs with a stationary kernel often yields poor modeling performance in such scenarios. In this paper, we propose a novel modeling technique called Class-GP (Class Gaussian Process) to model a class of heterogeneous functions, i.e., non-stationary functions which can be divided into locally stationary functions over the partitions of input space with one active stationary function in each partition. We provide theoretical insights into the modeling power of Class-GP and demonstrate its benefits over standard modeling techniques via extensive empirical evaluations.
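To make the piecewise idea concrete, the following is a minimal sketch of the modeling strategy the abstract describes: one stationary GP per partition, with predictions routed by partition membership. The partition boundaries, RBF length-scale, jitter, and test function below are illustrative assumptions, not the paper's experimental setup, and the GP regression is a plain NumPy implementation rather than the authors' code.

```python
import numpy as np

def rbf_kernel(A, B, length_scale):
    """Stationary squared-exponential kernel on 1-D inputs."""
    d2 = (A[:, None] - B[None, :]) ** 2
    return np.exp(-0.5 * d2 / length_scale ** 2)

def gp_posterior_mean(X_train, y_train, X_test, length_scale=0.1, noise=1e-4):
    """GP regression posterior mean with an RBF kernel and small jitter."""
    K = rbf_kernel(X_train, X_train, length_scale) + noise * np.eye(len(X_train))
    K_star = rbf_kernel(X_test, X_train, length_scale)
    return K_star @ np.linalg.solve(K, y_train)

# Heterogeneous target: locally stationary on each piece, globally non-stationary.
def f(x):
    return np.where(x < 0.5, np.sin(20 * x), 0.1 * x)

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0.0, 1.0, 80))
y = f(X)

partitions = [(0.0, 0.5), (0.5, 1.0)]  # assumed known partition of [0, 1]

def class_gp_predict(x_query):
    """Predict with the single active per-partition GP at each query point."""
    x_query = np.asarray(x_query, dtype=float)
    out = np.empty_like(x_query)
    for i, (lo, hi) in enumerate(partitions):
        last = i == len(partitions) - 1
        train_mask = (X >= lo) & ((X <= hi) if last else (X < hi))
        query_mask = (x_query >= lo) & ((x_query <= hi) if last else (x_query < hi))
        if query_mask.any():
            out[query_mask] = gp_posterior_mean(
                X[train_mask], y[train_mask], x_query[query_mask]
            )
    return out
```

A single stationary RBF kernel would have to compromise between the rapidly varying sine region and the nearly linear region; fitting an independent GP per partition lets each length-scale (here shared for brevity) specialize to its locally stationary piece.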
References
Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and Regression Trees. Routledge, Milton Park (2017)
Candelieri, A., Pedrielli, G.: Treed-Gaussian processes with support vector machines as nodes for nonstationary Bayesian optimization. In: 2021 Winter Simulation Conference (WSC), pp. 1–12. IEEE (2021)
Davis, C.B., Hans, C.M., Santner, T.J.: Prediction of non-stationary response functions using a Bayesian composite Gaussian process. Comput. Stat. Data Anal. 154, 107083 (2021)
Fuentes, M., Smith, R.L.: A new class of nonstationary spatial models. Technical report, North Carolina State University, Department of Statistics (2001)
Gibbs, M.N.: Bayesian Gaussian processes for regression and classification. Ph.D. thesis, Citeseer (1998)
Gramacy, R.B., Lee, H.K.H.: Bayesian treed Gaussian process models with an application to computer modeling. J. Am. Stat. Assoc. 103(483), 1119–1130 (2008)
Hansen, N., Ostermeier, A.: Completely derandomized self-adaptation in evolution strategies. Evol. Comput. 9(2), 159–195 (2001)
Heinonen, M., Mannerström, H., Rousu, J., Kaski, S., Lähdesmäki, H.: Non-stationary Gaussian process regression with Hamiltonian Monte Carlo. In: Gretton, A., Robert, C.C. (eds.) Proceedings of the 19th International Conference on Artificial Intelligence and Statistics. Proceedings of Machine Learning Research, Cadiz, Spain, vol. 51, pp. 732–740. PMLR (2016)
Jones, D.R., Perttunen, C.D., Stuckman, B.E.: Lipschitzian optimization without the Lipschitz constant. J. Optim. Theory Appl. 79(1), 157–181 (1993)
Kim, H.M., Mallick, B.K., Holmes, C.C.: Analyzing nonstationary spatial data using piecewise gaussian processes. J. Am. Stat. Assoc. 100(470), 653–668 (2005)
Lederer, A., Umlauft, J., Hirche, S.: Uniform error bounds for Gaussian process regression with application to safe control. In: Advances in Neural Information Processing Systems, vol. 32 (2019)
Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Math. Program. 45(1), 503–528 (1989)
Loh, W.Y.: Classification and regression trees. Wiley Interdiscip. Rev. Data Mining Knowl. Discov. 1(1), 14–23 (2011)
Malu, M., Dasarathy, G., Spanias, A.: Bayesian optimization in high-dimensional spaces: a brief survey. In: 2021 12th International Conference on Information, Intelligence, Systems & Applications (IISA), pp. 1–8. IEEE (2021)
Marmin, S., Ginsbourger, D., Baccou, J., Liandrat, J.: Warped gaussian processes and derivative-based sequential designs for functions with heterogeneous variations. SIAM/ASA J. Uncertain. Quantif. 6(3), 991–1018 (2018)
Mathesen, L., Yaghoubi, S., Pedrielli, G., Fainekos, G.: Falsification of cyber-physical systems with robustness uncertainty quantification through stochastic optimization with adaptive restart. In: 2019 IEEE 15th International Conference on Automation Science and Engineering (CASE), pp. 991–997. IEEE (2019)
Paciorek, C.J., Schervish, M.J.: Spatial modelling using a new class of nonstationary covariance functions. Environmetrics 17(5), 483–506 (2006)
Paciorek, C.J.: Nonstationary Gaussian processes for regression and spatial modelling. Ph.D. thesis, Carnegie Mellon University (2003)
Pope, C.A., et al.: Gaussian process modeling of heterogeneity and discontinuities using Voronoi tessellations. Technometrics 63(1), 53–63 (2021)
Rasmussen, C.E.: Gaussian processes in machine learning. In: Bousquet, O., von Luxburg, U., Rätsch, G. (eds.) ML 2003. LNCS (LNAI), vol. 3176, pp. 63–71. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-28650-9_4
Schmidt, A.M., O’Hagan, A.: Bayesian inference for non-stationary spatial covariance structure via spatial deformations. J. Roy. Stat. Soc. Ser. B (Stat. Methodol.) 65(3), 743–758 (2003)
Schulz, E., Speekenbrink, M., Krause, A.: A tutorial on Gaussian process regression: modelling, exploring, and exploiting functions. J. Math. Psychol. 85, 1–16 (2018)
Acknowledgements
This work is supported in part by the National Science Foundation (NSF) under awards 2200161, 2048223, 2003111, 2046588, 2134256, 1815361, 2031799, 2205080, 1901243, and 1540040, by the DARPA ARCOS program under contract FA8750-20-C-0507, by the Lockheed Martin-funded contract FA8750-22-9-0001, and by the SenSIP Center.
A Appendix
The proof sketch for Theorem 1 follows along the lines of the proof of Theorem 3.1 in [11]. From [11], we obtain probabilistic uniform error bounds for the GP in each partition \(j \in [p]\); we then use these per-partition bounds to bound the overall function and to derive a bound on the \(L_1\) norm. The proofs of the theorem and corollary are as follows:
Proof
The following bound holds on each partition \(j \in [p]\) with probability \(1-\delta _j\):
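Based on Theorem 3.1 in [11] and the combination step used below, the per-partition bound has the form (a reconstruction matched to the notation of this proof, not the paper's verbatim display):

\[ |f(\textbf{x}) - \mu _{n_j}(\textbf{x})| \le \sqrt{\beta _j(r)}\,\sigma _{n_j}(\textbf{x}) + \gamma _j(r), \qquad \forall \, \textbf{x} \in \mathcal {X}_j, \]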
where \(\beta _j(r)\) and \(\gamma _j(r)\) are given as follows
Now, to bound the entire function, let us consider the difference \(|f(\textbf{x}) - \mu _n(\textbf{x})|\).
The last inequality (17) follows from (12) and holds with probability \(1-\delta \), where \(\delta = \sum _{j=1}^{p} \mathbb {1}\{x\in \mathcal {X}_j\} \delta _j\).
Now, redefining \(\sum _{j=1}^{p} \mathbb {1}\{x\in \mathcal {X}_j\} \left( \sqrt{\beta _j(r)}\sigma _{n_j}(\textbf{x})\right) = \sqrt{\beta (r)}\sigma _{n}(\textbf{x})\) and
\(\sum _{j=1}^{p} \mathbb {1}\{x\in \mathcal {X}_j\} \gamma _j(r) = \gamma (r)\), we have the result. \(\square \)
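Written out, the combined bound from this last step reads (a sketch consistent with the redefinitions above):

\[ |f(\textbf{x}) - \mu _n(\textbf{x})| \le \sum _{j=1}^{p} \mathbb {1}\{x\in \mathcal {X}_j\} \left( \sqrt{\beta _j(r)}\,\sigma _{n_j}(\textbf{x}) + \gamma _j(r) \right) = \sqrt{\beta (r)}\,\sigma _n(\textbf{x}) + \gamma (r), \]

which holds with probability at least \(1-\delta \).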
The proof of Corollary 1 uses the high-confidence bound (10) and proceeds as follows:
Proof
We know that the \(L_1\) norm is given by
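The \(L_1\) norm of the error can be bounded by integrating the pointwise high-confidence bound over the domain (a sketch under the notation above, not the paper's verbatim display):

\[ \Vert f - \mu _n \Vert _1 = \int _{\mathcal {X}} |f(\textbf{x}) - \mu _n(\textbf{x})| \, d\textbf{x} \le \int _{\mathcal {X}} \left( \sqrt{\beta (r)}\,\sigma _n(\textbf{x}) + \gamma (r) \right) d\textbf{x}, \]

with probability at least \(1-\delta \),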
where \(\delta = \sum _{j=1}^p \delta _j\) and \(\delta _j = M(r,\mathcal {X}_j)\exp (-\beta _j(r)/2)\). \(\square \)
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Malu, M., Pedrielli, G., Dasarathy, G., Spanias, A. (2023). Class GP: Gaussian Process Modeling for Heterogeneous Functions. In: Sellmann, M., Tierney, K. (eds) Learning and Intelligent Optimization. LION 2023. Lecture Notes in Computer Science, vol 14286. Springer, Cham. https://doi.org/10.1007/978-3-031-44505-7_28
Print ISBN: 978-3-031-44504-0
Online ISBN: 978-3-031-44505-7