
Class GP: Gaussian Process Modeling for Heterogeneous Functions

  • Conference paper

Learning and Intelligent Optimization (LION 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14286)

Abstract

Gaussian processes (GPs) are a powerful framework for modeling expensive black-box functions and have thus been adopted for a variety of challenging modeling and optimization problems. GP-based modeling typically defaults to a stationary covariance kernel over the input domain, but many real-world applications, such as controls and cyber-physical system safety, require modeling and optimizing functions that are locally stationary yet globally non-stationary; standard GPs with a stationary kernel often model such functions poorly. In this paper, we propose a novel modeling technique called Class-GP (Class Gaussian Process) for a class of heterogeneous functions: non-stationary functions that can be divided into locally stationary functions over partitions of the input space, with one active stationary function in each partition. We provide theoretical insights into the modeling power of Class-GP and demonstrate its benefits over standard modeling techniques through extensive empirical evaluations.
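To make the construction concrete, the following is a minimal sketch of the partition-then-model idea described above: a classifier assigns each input to a partition, and an independent stationary GP is fit within each partition. This is an illustrative sketch, not the authors' implementation; the class name ClassGPSketch, the decision-tree partitioner, and the RBF kernel choice are all assumptions.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, RBF
from sklearn.tree import DecisionTreeClassifier

class ClassGPSketch:
    """One stationary GP per partition; test points are routed by a classifier."""

    def fit(self, X, y, labels):
        # X: (n, d) inputs, y: (n,) targets, labels[i]: partition index of X[i].
        X, y, labels = np.asarray(X), np.asarray(y), np.asarray(labels)
        self.classifier = DecisionTreeClassifier().fit(X, labels)
        self.gps = {}
        for j in np.unique(labels):
            mask = labels == j
            kernel = ConstantKernel() * RBF()  # stationary kernel per partition
            self.gps[j] = GaussianProcessRegressor(
                kernel=kernel, normalize_y=True).fit(X[mask], y[mask])
        return self

    def predict(self, X):
        # Each test point is predicted by the GP of its inferred partition.
        X = np.asarray(X)
        parts = self.classifier.predict(X)
        mean, std = np.empty(len(X)), np.empty(len(X))
        for j, gp in self.gps.items():
            mask = parts == j
            if mask.any():
                mean[mask], std[mask] = gp.predict(X[mask], return_std=True)
        return mean, std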


References

  1. Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and Regression Trees. Routledge, Milton Park (2017)

  2. Candelieri, A., Pedrielli, G.: Treed Gaussian processes with support vector machines as nodes for nonstationary Bayesian optimization. In: 2021 Winter Simulation Conference (WSC), pp. 1–12. IEEE (2021)

  3. Davis, C.B., Hans, C.M., Santner, T.J.: Prediction of non-stationary response functions using a Bayesian composite Gaussian process. Comput. Stat. Data Anal. 154, 107083 (2021)

  4. Fuentes, M., Smith, R.L.: A new class of nonstationary spatial models. Technical report, North Carolina State University, Department of Statistics (2001)

  5. Gibbs, M.N.: Bayesian Gaussian processes for regression and classification. Ph.D. thesis, University of Cambridge (1998)

  6. Gramacy, R.B., Lee, H.K.H.: Bayesian treed Gaussian process models with an application to computer modeling. J. Am. Stat. Assoc. 103(483), 1119–1130 (2008)

  7. Hansen, N., Ostermeier, A.: Completely derandomized self-adaptation in evolution strategies. Evol. Comput. 9(2), 159–195 (2001)

  8. Heinonen, M., Mannerström, H., Rousu, J., Kaski, S., Lähdesmäki, H.: Non-stationary Gaussian process regression with Hamiltonian Monte Carlo. In: Gretton, A., Robert, C.C. (eds.) Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS). Proceedings of Machine Learning Research, vol. 51, pp. 732–740. PMLR, Cadiz, Spain (2016)

  9. Jones, D.R., Perttunen, C.D., Stuckman, B.E.: Lipschitzian optimization without the Lipschitz constant. J. Optim. Theory Appl. 79(1), 157–181 (1993)

  10. Kim, H.M., Mallick, B.K., Holmes, C.C.: Analyzing nonstationary spatial data using piecewise Gaussian processes. J. Am. Stat. Assoc. 100(470), 653–668 (2005)

  11. Lederer, A., Umlauft, J., Hirche, S.: Uniform error bounds for Gaussian process regression with application to safe control. In: Advances in Neural Information Processing Systems, vol. 32 (2019)

  12. Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Math. Program. 45(1), 503–528 (1989)

  13. Loh, W.Y.: Classification and regression trees. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 1(1), 14–23 (2011)

  14. Malu, M., Dasarathy, G., Spanias, A.: Bayesian optimization in high-dimensional spaces: a brief survey. In: 2021 12th International Conference on Information, Intelligence, Systems & Applications (IISA), pp. 1–8. IEEE (2021)

  15. Marmin, S., Ginsbourger, D., Baccou, J., Liandrat, J.: Warped Gaussian processes and derivative-based sequential designs for functions with heterogeneous variations. SIAM/ASA J. Uncertain. Quantif. 6(3), 991–1018 (2018)

  16. Mathesen, L., Yaghoubi, S., Pedrielli, G., Fainekos, G.: Falsification of cyber-physical systems with robustness uncertainty quantification through stochastic optimization with adaptive restart. In: 2019 IEEE 15th International Conference on Automation Science and Engineering (CASE), pp. 991–997. IEEE (2019)

  17. Paciorek, C.J., Schervish, M.J.: Spatial modelling using a new class of nonstationary covariance functions. Environmetrics 17(5), 483–506 (2006)

  18. Paciorek, C.J.: Nonstationary Gaussian processes for regression and spatial modelling. Ph.D. thesis, Carnegie Mellon University (2003)

  19. Pope, C.A., et al.: Gaussian process modeling of heterogeneity and discontinuities using Voronoi tessellations. Technometrics 63(1), 53–63 (2021)

  20. Rasmussen, C.E.: Gaussian processes in machine learning. In: Bousquet, O., von Luxburg, U., Rätsch, G. (eds.) Advanced Lectures on Machine Learning. LNCS (LNAI), vol. 3176, pp. 63–71. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-28650-9_4

  21. Schmidt, A.M., O'Hagan, A.: Bayesian inference for non-stationary spatial covariance structure via spatial deformations. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 65(3), 743–758 (2003)

  22. Schulz, E., Speekenbrink, M., Krause, A.: A tutorial on Gaussian process regression: modelling, exploring, and exploiting functions. J. Math. Psychol. 85, 1–16 (2018)


Acknowledgements

This work is supported in part by the National Science Foundation (NSF) under awards 2200161, 2048223, 2003111, 2046588, 2134256, 1815361, 2031799, 2205080, 1901243, and 1540040, by the DARPA ARCOS program under contract FA8750-20-C-0507, by the Lockheed Martin funded contract FA8750-22-9-0001, and by the SenSIP Center.

Author information


Correspondence to Mohit Malu.


A Appendix

The proof sketch for Theorem 1 follows along the lines of the proof of Theorem 3.1 in [11]. From [11], we obtain probabilistic uniform error bounds for the GP on each partition \(j \in [p]\), and we combine these per-partition bounds to bound the error over the entire function and to derive a bound on the \(L_1\) norm. The proofs of the theorem and the corollary are given as follows:

Proof

The following bound holds on each partition with probability \(1-\delta _j\):

$$\begin{aligned} \left| g_j(\textbf{x}) - \mu _{n_j}(\textbf{x})\right| \le \sqrt{\beta _j(r)}\sigma _{n_j}(\textbf{x}) + \gamma _j(r), \forall \textbf{x} \in \mathcal {X}_j \end{aligned}$$
(12)

where \(\beta _j(r)\) and \(\gamma _j(r)\) are given as follows

$$\begin{aligned} \beta _j(r) &= 2\log \left( \frac{M(r,\mathcal {X}_j)}{\delta _j}\right) \end{aligned}$$
(13)
$$\begin{aligned} \gamma _j(r) &= (L_{\mu _{n_j}} + L_{g_j})r + \sqrt{\beta _j(r)}\, \omega _{\sigma _{n_j}} \end{aligned}$$
(14)

Now to bound the entire function lets look at the difference \(|f(\textbf{x}) - \mu _n(\textbf{x})|\).

$$\begin{aligned} \left| f(\textbf{x}) - \mu _n(\textbf{x})\right| &= \left| \sum _{j=1}^{p} \mathbb {1}\{x\in \mathcal {X}_j\}(g_j(\textbf{x}) - \mu _{n_j}(\textbf{x})) \right| \end{aligned}$$
(15)
$$\begin{aligned} &= \sum _{j=1}^{p} \mathbb {1}\{x\in \mathcal {X}_j\}\left| g_j(\textbf{x}) - \mu _{n_j}(\textbf{x})\right| \end{aligned}$$
(16)
$$\begin{aligned} &\le \sum _{j=1}^{p} \mathbb {1}\{x\in \mathcal {X}_j\} \left( \sqrt{\beta _j(r)}\sigma _{n_j}(\textbf{x}) + \gamma _j(r)\right) , \forall \textbf{x} \in \mathcal {X} \end{aligned}$$
(17)

The last inequality (17) follows from (12) and holds with probability \(1-\delta \), where \(\delta = \sum _{j=1}^{p} \mathbb {1}\{x\in \mathcal {X}_j\} \delta _j\).

Now, redefining \(\sum _{j=1}^{p} \mathbb {1}\{x\in \mathcal {X}_j\} \sqrt{\beta _j(r)}\,\sigma _{n_j}(\textbf{x}) = \sqrt{\beta (r)}\,\sigma _{n}(\textbf{x})\) and \(\sum _{j=1}^{p} \mathbb {1}\{x\in \mathcal {X}_j\}\, \gamma _j(r) = \gamma (r)\), we have the result.    \(\square \)
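Written out, the combined statement is the conclusion of Theorem 1 restated under the same assumptions: with probability at least \(1-\delta \),

$$\begin{aligned} \left| f(\textbf{x}) - \mu _n(\textbf{x})\right| \le \sqrt{\beta (r)}\,\sigma _{n}(\textbf{x}) + \gamma (r), \quad \forall \textbf{x} \in \mathcal {X} \end{aligned}$$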

The proof of Corollary 1 uses the high-confidence bound (10) and proceeds as follows:

Proof

The \(L_1\) norm is given by

$$\begin{aligned} \Vert f(\textbf{x}) - \mu _n(\textbf{x})\Vert _1 &= \textrm{E}[\left| f(\textbf{x}) - \mu _n(\textbf{x})\right| ] \end{aligned}$$
(18)
$$\begin{aligned} &= \int \left| f(\textbf{x}) - \mu _n(\textbf{x})\right| d\mu \end{aligned}$$
(19)
$$\begin{aligned} &= \int \left| \sum _{j=1}^{p} \mathbb {1}\{x\in \mathcal {X}_j\}(g_j(\textbf{x}) - \mu _{n_j}(\textbf{x})) \right| d\mu \end{aligned}$$
(20)
$$\begin{aligned} &= \sum _{j=1}^{p} \int \mathbb {1}\{x\in \mathcal {X}_j\} \left| (g_j(\textbf{x}) - \mu _{n_j}(\textbf{x})) \right| d\mu \end{aligned}$$
(21)
$$\begin{aligned} &= \sum _{j=1}^{p} \int _{\mathcal {X}_j} \left| (g_j(\textbf{x}) - \mu _{n_j}(\textbf{x})) \right| d\mu \end{aligned}$$
(22)
$$\begin{aligned} &\le \zeta r^{d} \sum _{j=1}^p M(r,\mathcal {X}_j)\left( \sqrt{\beta _j(r)}\sigma _{n_j}(\textbf{x}) + \gamma _j(r)\right) ~~~\text {holds w.p }1-\delta \end{aligned}$$
(23)

The last inequality, (23), follows by covering each partition \(\mathcal {X}_j\) with \(M(r,\mathcal {X}_j)\) balls of radius \(r\), each of measure at most \(\zeta r^{d}\), and applying the per-partition bound (12); here \(\delta = \sum _{j=1}^p \delta _j\) and \(\delta _j = M(r,\mathcal {X}_j)\exp (-\beta _j(r)/2)\).    \(\square \)
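As a sanity check on how the per-partition confidence parameters combine, the following toy computation evaluates \(\beta _j(r)\) from (13) and the union-bound failure probability \(\delta = \sum _j \delta _j\). The covering numbers \(M(r,\mathcal {X}_j)\) and per-partition budgets \(\delta _j\) below are hypothetical values, not figures from the paper.

import numpy as np

def beta_j(M_j, delta_j):
    # Eq. (13): beta_j(r) = 2 log(M(r, X_j) / delta_j)
    return 2.0 * np.log(M_j / delta_j)

M = np.array([50.0, 120.0, 80.0])           # hypothetical covering numbers, p = 3
delta_parts = np.array([0.01, 0.02, 0.01])  # per-partition failure budgets

betas = beta_j(M, delta_parts)              # approx. [17.0, 17.4, 18.0]
delta_total = delta_parts.sum()             # union bound: delta = 0.04
print(betas, delta_total)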


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Malu, M., Pedrielli, G., Dasarathy, G., Spanias, A. (2023). Class GP: Gaussian Process Modeling for Heterogeneous Functions. In: Sellmann, M., Tierney, K. (eds) Learning and Intelligent Optimization. LION 2023. Lecture Notes in Computer Science, vol 14286. Springer, Cham. https://doi.org/10.1007/978-3-031-44505-7_28


  • DOI: https://doi.org/10.1007/978-3-031-44505-7_28


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-44504-0

  • Online ISBN: 978-3-031-44505-7

