
High Dimensional Bayesian Optimization with Kernel Principal Component Analysis

  • Conference paper
  • Parallel Problem Solving from Nature – PPSN XVII (PPSN 2022)

Abstract

Bayesian Optimization (BO) is a surrogate-based global optimization strategy that relies on a Gaussian Process regression (GPR) model to approximate the objective function and an acquisition function to suggest candidate points. It is well known that BO does not scale well to high-dimensional problems, because the GPR model requires substantially more data points to achieve sufficient accuracy and the acquisition optimization becomes computationally expensive in high dimensions. Several recent works address these issues, e.g., by implementing online variable selection or by conducting the search on a lower-dimensional sub-manifold of the original search space. Building on our previous PCA-BO approach, which learns a linear sub-manifold, this paper proposes a novel kernel PCA-assisted BO (KPCA-BO) algorithm that embeds a non-linear sub-manifold in the search space and performs BO on this sub-manifold. Intuitively, constructing the GPR model on a lower-dimensional sub-manifold helps improve the modeling accuracy without requiring much more data from the objective function. Moreover, our approach defines the acquisition function on the lower-dimensional sub-manifold, making the acquisition optimization more manageable.
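To make the loop concrete, the following is a minimal, hypothetical sketch of a KPCA-assisted BO iteration, not the authors' reference implementation: it pairs scikit-learn's KernelPCA (with its built-in pre-image approximation) with a Matérn GP surrogate and Expected Improvement. The toy sphere objective, the reduced dimension k, the RBF bandwidth gamma, and the box bounds on the reduced space are all illustrative assumptions.

    # A minimal, hypothetical KPCA-assisted BO loop (illustrative sketch,
    # not the paper's implementation).
    import numpy as np
    from scipy.optimize import minimize
    from scipy.stats import norm
    from sklearn.decomposition import KernelPCA
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import Matern

    rng = np.random.default_rng(0)
    dim, k, lb, ub = 20, 2, -5.0, 5.0            # assumed sizes and box bounds

    def sphere(x):                               # toy stand-in for the expensive objective
        return float(np.sum(x ** 2))

    X = rng.uniform(lb, ub, size=(10, dim))      # initial design
    y = np.array([sphere(x) for x in X])

    for _ in range(30):
        # 1) Learn a non-linear sub-manifold from the evaluated points.
        kpca = KernelPCA(n_components=k, kernel="rbf", gamma=0.1,
                         fit_inverse_transform=True)  # enables pre-image mapping
        Z = kpca.fit_transform(X)

        # 2) Fit the GP surrogate in the k-dimensional reduced space.
        gpr = GaussianProcessRegressor(kernel=Matern(nu=2.5),
                                       normalize_y=True).fit(Z, y)

        # 3) Maximize Expected Improvement over the reduced space
        #    (bounds taken from the spread of the mapped data: an assumption).
        y_best = y.min()

        def neg_ei(z):
            mu, sd = gpr.predict(z.reshape(1, -1), return_std=True)
            s = max(float(sd[0]), 1e-12)
            gain = y_best - float(mu[0])
            u = gain / s
            return -(gain * norm.cdf(u) + s * norm.pdf(u))

        res = minimize(neg_ei, Z[np.argmin(y)], method="L-BFGS-B",
                       bounds=[(Z[:, i].min(), Z[:, i].max()) for i in range(k)])

        # 4) Map the candidate back to the original space (pre-image) and evaluate.
        x_new = np.clip(kpca.inverse_transform(res.x.reshape(1, -1))[0], lb, ub)
        X = np.vstack([X, x_new])
        y = np.append(y, sphere(x_new))

    print("best value found:", y.min())

The delicate step in any such scheme is the pre-image problem, i.e., mapping a candidate from the sub-manifold back to the original search space; this sketch simply delegates it to scikit-learn's ridge-regression-based inverse_transform.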

We compare the performance of KPCA-BO to vanilla BO and to PCA-BO on the multi-modal problems of the COCO/BBOB benchmark suite. Empirical results show that KPCA-BO outperforms BO in terms of convergence speed on most test problems, and this benefit becomes more significant as the dimensionality increases. For the 60D functions, KPCA-BO achieves better results than PCA-BO on many test cases. Compared to vanilla BO, it also substantially reduces the CPU time required to train the GPR model and to optimize the acquisition function.

K. Antonov—Work done while visiting Sorbonne Université in Paris.


Notes

  1. The outer product is a linear operator defined, for every \(h\in \mathcal {H}\), by \(\phi (\mathbf {x})\phi (\mathbf {x})^\top : h\mapsto \langle \phi (\mathbf {x}), h \rangle _{\mathcal {H}}\,\phi (\mathbf {x})\). Hence, the sample covariance is also a linear operator \(C:\mathcal {H} \rightarrow \mathcal {H}\).
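For context, the standard kernel-trick reduction (a textbook derivation rather than anything specific to this paper) shows why \(C\) never has to be formed explicitly: assuming the feature vectors are centered in \(\mathcal {H}\), every eigenfunction of \(C\) lies in the span of the mapped samples, so the infinite-dimensional eigenproblem collapses to one on the \(n \times n\) Gram matrix,

\[ C = \frac{1}{n}\sum_{i=1}^{n} \phi(\mathbf{x}_i)\phi(\mathbf{x}_i)^\top, \qquad Cv = \lambda v, \qquad v = \sum_{i=1}^{n}\alpha_i\,\phi(\mathbf{x}_i) \;\Longrightarrow\; K\boldsymbol{\alpha} = n\lambda\,\boldsymbol{\alpha}, \quad K_{ij} = k(\mathbf{x}_i,\mathbf{x}_j). \]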


Acknowledgments

Our work is supported by the Paris Île-de-France region (via the DIM RFSI project AlgoSelect), by the CNRS INS2I institute (via the RandSearch project), by the PRIME programme of the German Academic Exchange Service (DAAD) with funds from the German Federal Ministry of Education and Research (BMBF), and by RFBR and CNRS, project number 20-51-15009.

Author information

Corresponding author: Kirill Antonov.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Antonov, K., Raponi, E., Wang, H., Doerr, C. (2022). High Dimensional Bayesian Optimization with Kernel Principal Component Analysis. In: Rudolph, G., Kononova, A.V., Aguirre, H., Kerschke, P., Ochoa, G., Tušar, T. (eds) Parallel Problem Solving from Nature – PPSN XVII. PPSN 2022. Lecture Notes in Computer Science, vol 13398. Springer, Cham. https://doi.org/10.1007/978-3-031-14714-2_9


  • DOI: https://doi.org/10.1007/978-3-031-14714-2_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-14713-5

  • Online ISBN: 978-3-031-14714-2

  • eBook Packages: Computer Science, Computer Science (R0)
