
Bayesian Optimization for Reverse Stress Testing

  • Conference paper
Intelligent Computing and Optimization (ICO 2020)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1324)


Abstract

Bayesian Optimization with an underlying Gaussian Process is used to solve a black-box optimization problem in which the function to be optimized presents particular difficulties: it can only be expressed in terms of a complicated and lengthy stochastic algorithm, and the value returned is only required to be sufficiently near a pre-determined ‘target’. We consider the context of financial stress testing, in which the data used has a significant noise component. Commonly used Bayesian Optimization acquisition functions cannot handle the ‘target’ condition satisfactorily, but a simple modification of the ‘Lower Confidence Bound’ acquisition function improves results markedly. A proof that the modified acquisition function is superior to the unmodified version is given.
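To make the ‘target’ condition concrete, the following is a minimal, hypothetical sketch (not the paper's model or implementation): a noisy, expensive loss simulation is to be driven close to a pre-determined target, so the quantity to minimise is the distance of the simulated loss from that target. The function `simulated_annual_loss`, the constant `TARGET` and all parameter values are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulated_annual_loss(x, n_sims=2_000):
    """Toy stand-in for a lengthy stochastic loss-aggregation routine:
    event frequency scales with the scenario parameter x, severities are lognormal."""
    freq = rng.poisson(lam=10.0 * x, size=n_sims)
    return np.mean([rng.lognormal(mean=1.0, sigma=0.8, size=k).sum() for k in freq])

TARGET = 250.0  # pre-determined 'target' loss level for the reverse stress test

def objective(x):
    """Distance of the modelled loss from the target: the optimum is any x
    for which this value is 'near enough' to zero."""
    return abs(simulated_annual_loss(x) - TARGET)

# Each call is expensive and noisy, which is why Bayesian Optimization is attractive.
for x in (1.0, 3.0, 5.0, 7.0):
    print(f"x = {x:.1f}  ->  objective(x) = {objective(x):.2f}")
```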


References

  1. Basel Committee on Banking Supervision: Stress testing principles, d450. Bank for International Settlements (BIS) (2018). https://www.bis.org/bcbs/publ/d450.htm

  2. European Banking Authority: 2020 EU-wide stress test methodological note (2019). https://www.eba.europa.eu/sites/default/documents/files/documents/10180/2841396/ba66328f-476f-4707-9a23-6df5957dc8c1/2020%20EU-wide%20stress%20test%20-%20Draft%20Methodological%20Note.pdf

  3. Frachot, A., Georges, P., Roncalli, T.: Loss Distribution Approach for operational risk, Working paper, Groupe de Recherche Operationnelle, Credit Lyonnais, France (2001). https://ssrn.com/abstract=1032523

  4. Mockus, J.: On Bayesian methods for seeking the extremum. In: Proceedings of IFIP Technical Conference, pp. 400–404 (1974). https://dl.acm.org/citation.cfm?id=646296.687872

  5. Mockus, J., Tiesis, V., Zilinskas, A.: The application of Bayesian methods for seeking the extremum. In: Dixon, L., Szego, G.P. (eds.) Towards Global Optimisation, vol. 2 (1978)


  6. Mockus, J.: The Bayesian approach to local optimization. In: Bayesian Approach to Global Optimization. Mathematics and Its Applications, vol. 37. Springer, Heidelberg (1989). https://doi.org/10.1007/978-94-009-0909-0_7

  7. Cox, D.D., John, S.: SDO: a statistical method for global optimization. In: Multidisciplinary Design Optimization, pp. 315–329. SIAM, Philadelphia (1997)


  8. Srinivas, N., Krause, A., Kakade, S., Seeger, M.: Gaussian process optimization in the bandit setting: no regret and experimental design. In: Proceedings of ICML 2010, pp. 1015–1022 (2010). https://dl.acm.org/citation.cfm?id=3104322.3104451

  9. Rana, S., Li, C., Gupta, S.: High dimensional Bayesian optimization with elastic Gaussian process. In: Proceedings of 34th International Conference on Machine Learning, Sydney, PMLR 70 (2017)


  10. Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning. MIT Press, Cambridge (2006)


  11. Murphy, K.P.: Machine Learning: A Probabilistic Perspective, Chapter 15. MIT Press, Cambridge (2012)


  12. Berk, J., Nguyen, V., Gupta, S., et al.: Exploration enhanced expected improvement for Bayesian optimization. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases. LNCS, vol. 11052, pp. 621–637 (2018)


  13. Jones, D.R.: A taxonomy of global optimization methods based on response surfaces. J. Global Optim. 21(4), 345–383 (2001)


  14. Kara, G., Özmen, A., Weber, G.: Stability advances in robust portfolio optimization under parallelepiped uncertainty. Central Eur. J. Oper. Res. 27, 241–261 (2019). https://doi.org/10.1007/s10100-017-0508-5

  15. Özmen, A., Weber, G.W., Batmaz, I., Kropat, E.: RCMARS: robustification of CMARS with different scenarios under polyhedral uncertainty set. Commun. Nonlinear Sci. Numer. Simul. 16(12), 4780–4787 (2011). https://doi.org/10.1016/j.cnsns.2011.04.001

  16. Savku, E., Weber, G.: Stochastic differential games for optimal investment problems in a Markov regime-switching jump-diffusion market. Ann. Oper. Res. (2020). https://doi.org/10.1007/s10479-020-03768-5


  17. Kwon, J., Mertikopoulos, P.: A continuous-time approach to online optimization. J. Dyn. Games 4(2), 125–148 (2017). https://doi.org/10.3934/jdg.2017008


  18. Ascher, U.M.: Discrete processes and their continuous limits. J. Dyn. Games 7(2), 123–140 (2020). https://doi.org/10.3934/jdg.2020008


  19. Yang, Y., Sutanto, C.: Chance-constrained optimization for nonconvex programs using scenario-based methods. ISA Trans. 90, 157–168 (2019). https://doi.org/10.1016/j.isatra.2019.01.013


  20. Ozer, F., Toroslu, I.H., Karagoz, P., Yucel, F.: Dynamic programming solution to ATM cash replenishment optimization problem. In: Intelligent Computing & Optimization. ICO 2018, vol. 866. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-00979-3_45

  21. Samakpong, T., Ongsakul, W., Nimal Madhu, M.: Optimal power flow considering cost of wind and solar power uncertainty using particle swarm optimization. In: Intelligent Computing and Optimization. ICO 2019, vol. 1072. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-33585-4_19

  22. Yan, Y.: rBayesianOptimization: Bayesian Optimization of Hyperparameters. R package (2016). https://cran.r-project.org/web/packages/rBayesianOptimization/index.html


Author information


Corresponding author

Correspondence to Peter Mitic.


Appendix A

This proof shows that the expected number of ‘expensive’ function evaluations of g (Eq. 2) is greater for LCB acquisition than for ZERO acquisition.

We first define Local Regret, the difference between a proposed function estimate and the actual function value, and then Cumulative Regret, which is the sum of the local regrets. All other notation is defined in the main body of this paper. The proof depends on a probability bound on the error estimate for \(g(x_{n})\) at an evaluation point \(x_{n}\), due to Srinivas et al. [8]. With that bound, the general strategy is to calculate an upper bound for local regret, use that bound to determine the expected value of local regret, and then calculate the expected number of ‘expensive’ function evaluations under the ZERO and LCB acquisition functions.

Definitions: Local regret \(r_{n}\) and Cumulative regret \(R_{N}\)

$$ r_{n} = g\left( {\hat{x}} \right) - g\left( {x_{n} } \right); 1 \le n \le N $$
(A1)
$$ R_{N} = \mathop \sum \nolimits_{n = 1}^{N} r_{n} $$
(A2)
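As a small illustration of Eqs. A1 and A2 only, the sketch below uses the assumed names `g_hat` and `g_evals` for \(g(\hat{x})\) and the sequence of values \(g(x_{n})\); these are not part of the paper's code.

```python
import numpy as np

def local_regret(g_hat, g_evals):
    """Eq. (A1): r_n = g(x_hat) - g(x_n) for each evaluation point x_n."""
    return g_hat - np.asarray(g_evals, dtype=float)

def cumulative_regret(g_hat, g_evals):
    """Eq. (A2): R_N is the sum of the N local regrets."""
    return local_regret(g_hat, g_evals).sum()

# Illustrative numbers only: g(x_hat) = 0.05 and four evaluations of g.
print(local_regret(0.05, [1.3, 0.7, 0.2, 0.1]))
print(cumulative_regret(0.05, [1.3, 0.7, 0.2, 0.1]))
```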

Definition: \(\beta_{n}\) (Srinivas et al. [8], Appendix A.1, Lemma 5.1), for constants \(C_{n} > 0\) that satisfy \(\mathop \sum \nolimits_{n \ge 1} C_{n}^{ - 1} = 1\), and small \(\delta\):

$$ \beta_{n} = 2 \log \left( {\frac{{\left| I \right|C_{n} }}{\delta }} \right). $$
(A3)

Using Lemma 5.1 of Srinivas et al. [8] (Appendix A.1), Equation (A4) provides upper and lower confidence bounds for \(g(x)\) in terms of the GP posterior mean \(\mu_{n-1}\) and standard deviation \(\sigma_{n-1}\).

$$ P\left( {\left| {g\left( x \right) - \mu_{n - 1} \left( x \right)} \right| \le \sqrt {\beta_{n} } \sigma_{n - 1} \left( x \right)} \right) \ge 1 - \delta ; \forall x \in I; 1 \le n \le N $$
(A4)
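A sketch of Eqs. A3 and A4 for a finite search set \(I\) follows. The choice \(C_{n} = n^{2}\pi^{2}/6\) is one standard choice satisfying \(\sum_{n \ge 1} C_{n}^{-1} = 1\) (as in Srinivas et al. [8]); the posterior quantities `mu` and `sigma` are assumed to come from any GP library, and the function names are illustrative assumptions, not the paper's code.

```python
import numpy as np

def beta_n(n, card_I, delta):
    """Eq. (A3): beta_n = 2 * log(|I| * C_n / delta), with C_n = n^2 * pi^2 / 6
    so that sum_n 1/C_n = 1."""
    C_n = (np.pi ** 2 / 6.0) * n ** 2
    return 2.0 * np.log(card_I * C_n / delta)

def confidence_band(mu, sigma, n, card_I, delta=0.05):
    """Eq. (A4): band mu +/- sqrt(beta_n) * sigma containing g(x) with prob. >= 1 - delta."""
    half_width = np.sqrt(beta_n(n, card_I, delta)) * np.asarray(sigma, dtype=float)
    return np.asarray(mu, dtype=float) - half_width, np.asarray(mu, dtype=float) + half_width

# Illustrative posterior values at two grid points, iteration n = 3, |I| = 100.
lower, upper = confidence_band(mu=[0.2, -0.1], sigma=[0.3, 0.5], n=3, card_I=100)
print(lower, upper)
```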

Then, since \(\hat{x}\) is an optimal solution (so that \(g\left( {\hat{x}} \right) < g\left( {x_{n} } \right)\)), the following upper and lower bounds apply respectively:

$$ \begin{gathered} \mu_{n - 1} \left( {\hat{x}} \right) + \sqrt {\beta_{n} } \sigma_{n - 1} \left( {\hat{x}} \right) \le \mu_{n - 1} \left( {x_{n} } \right) + \sqrt {\beta_{n} } \sigma_{n - 1} \left( {x_{n} } \right) \hfill \\ \mu_{n - 1} \left( {x_{n} } \right) - \sqrt {\beta_{n} } \sigma_{n - 1} \left( {x_{n} } \right) \le \mu_{n - 1} \left( {\hat{x}} \right) + \sqrt {\beta_{n} } \sigma_{n - 1} \left( {\hat{x}} \right). \hfill \\ \end{gathered} $$
(A5)

The general strategy in this proof is to estimate the maximum cumulative regret in each of these cases for LCB and ZERO acquisitions, and then calculate the expected difference between the two.

Proposition: ZERO acquisition converges faster than LCB acquisition when the bound in Eq. A4 holds, that is, with probability at least \(1 - \delta\).

Proof: First, with probability at least \(1 - \delta\) (Eq. A4), the Local Regret calculation proceeds as in Eq. A6. The first line uses Eq. A4 with \(x = \hat{x}\), the second line uses Eq. A5 and the third line uses Eq. A4 with \(x = x_{n}\). The fourth line uses the upper bound \(\beta = \mathop {\max }\limits_{n} \left( {\beta_{n} } \right)\).

$$ \begin{array}{*{20}c} {r_{n} \le \mu _{{n - 1}} (\hat{x}) + \sqrt {\beta _{n} } \sigma _{{n - 1}} (\hat{x}) - g(x_{n} )} \\ { \le \mu _{{n - 1}} (x_{n} ) + \sqrt {\beta _{n} } \sigma _{{n - 1}} (x_{n} ) - g(x_{n} )} \\ { \le 2\sqrt {\beta _{n} } \sigma _{{n - 1}} (x_{n} )} \\ { \le 2\sqrt \beta \sigma _{{n - 1}} (x_{n} )} \\ \end{array} $$
(A6)

Now consider the Local Regret in the cases of LCB and ZERO acquisition. ZERO acquisition is always non-negative but LCB acquisition can be negative. So we partition the values of n into those which result in zero or positive LCB acquisition (set S) and those which result in negative LCB acquisition (the complement, set \(S^{\prime}\)). These sets are shown in Eq. A7.

$$ \begin{gathered} S = \left\{ {n: \mu_{n - 1} \left( {x_{n} } \right) - \kappa \sigma_{n - 1} \left( {x_{n} } \right) \ge 0; 1 \le n \le N} \right\} \hfill \\ S^{\prime} = \left\{ {n: \mu_{n - 1} \left( {x_{n} } \right) - \kappa \sigma_{n - 1} \left( {x_{n} } \right) < 0; 1 \le n \le N} \right\} \hfill \\ \end{gathered} $$
(A7)
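A sketch of the partition in Eq. A7 follows, assuming arrays `mu_list` and `sigma_list` hold \(\mu_{n-1}(x_{n})\) and \(\sigma_{n-1}(x_{n})\) for the N iterations (illustrative names only; indices here are zero-based).

```python
import numpy as np

def partition_iterations(mu_list, sigma_list, kappa):
    """Eq. (A7): split iteration indices by the sign of the LCB at the proposed point."""
    lcb = np.asarray(mu_list, dtype=float) - kappa * np.asarray(sigma_list, dtype=float)
    S = np.where(lcb >= 0)[0]        # LCB zero or positive
    S_prime = np.where(lcb < 0)[0]   # LCB negative
    return S, S_prime

# Illustrative values: the second iteration has a negative LCB.
S, S_prime = partition_iterations(mu_list=[0.9, 0.2, 0.6], sigma_list=[0.3, 0.4, 0.1], kappa=1.0)
print("S =", S, " S' =", S_prime)
```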

For S, the evaluation points proposed are identical for the two acquisition functions, since they both correspond to the same minimum. Therefore, using superscripts to denote the regrets for the two acquisition functions, the following equality applies.

$$ r_{n}^{{\left( {LCB} \right)}} = r_{n}^{{\left( {ZERO} \right)}} ; n \in S $$
(A8)

For \(S^{\prime}\), ZERO acquisition returns a proposal that corresponds to a zero of the acquisition function, whereas the corresponding LCB value is negative; we introduce a term \(\phi_{n} > 0\) to account for the difference from zero (Eq. A9).

$$ \begin{gathered} \mu_{n - 1} \left( {x_{n} } \right) = \kappa \sigma_{n - 1} \left( {x_{n} } \right) \,\left( {ZERO} \right) \hfill \\ \mu_{n - 1} \left( {x_{n} } \right) = \kappa \sigma_{n - 1} \left( {x_{n} } \right) - \phi_{n} \,\left( {LCB} \right) \hfill \\ \end{gathered} $$
(A9)
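The sketch below contrasts the two acquisition functions on a small grid. The LCB is standard; the paper's ZERO acquisition is defined in the main body, which is not reproduced here. One reading consistent with Eq. A9 and with the remark that ZERO acquisition is always non-negative is the absolute value of the LCB, and that assumption is what the sketch uses; it is illustrative only, not the paper's definition.

```python
import numpy as np

def lcb(mu, sigma, kappa):
    """Standard Lower Confidence Bound: mu(x) - kappa * sigma(x)."""
    return np.asarray(mu, dtype=float) - kappa * np.asarray(sigma, dtype=float)

def zero_acquisition(mu, sigma, kappa):
    # Assumption for illustration only: ZERO(x) = |mu(x) - kappa * sigma(x)|,
    # whose minimiser sits at a zero of the LCB whenever the LCB goes negative.
    return np.abs(lcb(mu, sigma, kappa))

# On a grid where the LCB dips below zero, the two proposals differ.
mu = np.array([0.8, 0.4, 0.1, 0.3, 0.7])
sigma = np.array([0.2, 0.3, 0.5, 0.2, 0.1])
kappa = 1.0
print("LCB proposal index :", np.argmin(lcb(mu, sigma, kappa)))
print("ZERO proposal index:", np.argmin(zero_acquisition(mu, sigma, kappa)))
```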

Using Eq. A6, this leads to the following expressions for the maximum of the two regrets.

$$ \left\{ {\begin{array}{*{20}l} {\mathop {\max }\limits_{n} \left( {r_{n}^{{\left( {ZERO} \right)}} } \right) = 2\sqrt \beta \sigma _{{n - 1}} \left( {x_{n} } \right) = \frac{{2\sqrt \beta }}{\kappa }\mu _{{n - 1}} \left( {x_{n} } \right)} \hfill \\ {\mathop {\max }\limits_{n} \left( {r_{n}^{{\left( {LCB} \right)}} } \right) = 2\sqrt \beta \sigma _{{n - 1}} \left( {x_{n} } \right) = \frac{{2\sqrt \beta }}{\kappa }\left( {\mu _{{n - 1}} \left( {x_{n} } \right) + \phi _{n} } \right)} \hfill \\ \end{array} } \right. $$
(A10)

Equation (A11) partitions the Cumulative Regret between the sets S and \(S^{\prime}\).

$$ R_{N} = \mathop \sum \nolimits_{n \in S} r_{n} + \mathop \sum \nolimits_{{n \in S^{\prime}}} r_{n} $$
(A11)

Then, Eq. A12 shows the maximum Cumulative Regret for ZERO and LCB acquisitions.

$$ \left\{ \begin{array}{l} \max \left( R_{N}^{\left( ZERO \right)} \right) = \sum\nolimits_{n \in S} r_{n} + \frac{2\sqrt \beta }{\kappa }\sum\nolimits_{n \in S^{\prime}} \mu_{n - 1} \left( x_{n} \right) \\ \max \left( R_{N}^{\left( LCB \right)} \right) = \sum\nolimits_{n \in S} r_{n} + \frac{2\sqrt \beta }{\kappa }\sum\nolimits_{n \in S^{\prime}} \left( \mu_{n - 1} \left( x_{n} \right) + \phi_{n} \right) \\ \end{array} \right. $$
(A12)
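A small numerical check of Eqs. A10–A13 follows (illustrative values only; the assumed names `r_S`, `mu_Sp` and `phi_Sp` stand for the regrets over \(S\) and for \(\mu_{n-1}(x_{n})\) and \(\phi_{n}\) over \(S^{\prime}\)).

```python
import numpy as np

def max_cumulative_regret(r_S, mu_Sp, phi_Sp, beta, kappa, acquisition):
    """Eq. (A12): bound on R_N; the phi_n terms contribute only under LCB acquisition."""
    phi = np.asarray(phi_Sp, dtype=float) if acquisition == "LCB" else 0.0
    return np.sum(r_S) + (2.0 * np.sqrt(beta) / kappa) * np.sum(np.asarray(mu_Sp, dtype=float) + phi)

# Illustrative values only.
r_S, mu_Sp, phi_Sp = [0.3, 0.1], [0.4, 0.2, 0.5], [0.05, 0.10, 0.02]
beta, kappa = 4.0, 1.0
gap = (max_cumulative_regret(r_S, mu_Sp, phi_Sp, beta, kappa, "LCB")
       - max_cumulative_regret(r_S, mu_Sp, phi_Sp, beta, kappa, "ZERO"))
# Eq. (A13): the gap equals (2*sqrt(beta)/kappa) * sum of phi_n over S'.
print(gap, (2.0 * np.sqrt(beta) / kappa) * np.sum(phi_Sp))
```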

Equation A12 then implies the inequality in Eq. A13 (noting that \(\phi_{n} > 0\) for \(n \in S^{\prime}\), provided \(S^{\prime}\) is non-empty).

$$ \max \left( R_{N}^{\left( LCB \right)} \right) - \max \left( R_{N}^{\left( ZERO \right)} \right) = \frac{2\sqrt \beta }{\kappa }\mathop \sum \nolimits_{n \in S^{\prime}} \phi_{n} > 0 $$
(A13)

Equation (A13) is a strong indication that ZERO acquisition leads to faster convergence than LCB acquisition, since it holds with probability at least \(1 - \delta\). This completes the proof.


Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Mitic, P. (2021). Bayesian Optimization for Reverse Stress Testing. In: Vasant, P., Zelinka, I., Weber, GW. (eds) Intelligent Computing and Optimization. ICO 2020. Advances in Intelligent Systems and Computing, vol 1324. Springer, Cham. https://doi.org/10.1007/978-3-030-68154-8_17

