1 Introduction

In this paper we consider the numerical approximation of optimal control problems. The subject is of importance for many applications such as aerospace engineering, chemical processing and resource economics, among others.

The value function of an optimal control problem is characterized as the solution of a first-order nonlinear Hamilton–Jacobi–Bellman (HJB) partial differential equation. A bottleneck in the computation of the value function is the need to solve a nonlinear partial differential equation in dimension n, which becomes challenging when n is large.

Several methods have been studied in the literature to mitigate the so-called curse of dimensionality, although it remains a difficult task. As stated in [10], the relevance of efficient numerical methods can be seen in the fact that methods solving the HJB equation are rarely used in practice due to the necessary computational effort. We mention some related references, without any claim of completeness. In [14], a domain decomposition technique is considered. In [26] semi-Lagrangian methods are studied. The authors in [22] apply data-based approximate policy iteration methods. A procedure for the numerical approximation of high-dimensional HJB equations associated with optimal feedback control problems for semilinear parabolic equations is proposed in [19]. In [9] a tensor decomposition approach is presented. In [10] an approach based on low-rank tensor train decompositions is applied. Methods using sparse grids for HJB equations are presented in [6]. The solution of HJB equations on a tree structure was presented in [2]. The author of [23, 24] discusses an approach, based on the max-plus algebra, to certain nonlinear HJB PDEs that is not subject to the curse of dimensionality. In [8] a data-driven approach based on knowledge of the value function and its gradient at sample points is developed. The authors of [3] present a new approach in which the value function is computed using radial basis functions. Further literature on the control of partial differential equations via the dynamic programming approach can be found in the last two references.

In the present paper we concentrate on reduced order models based on proper orthogonal decomposition (POD) methods. Our work is related to [1]. In this reference the authors propose two different ways to apply POD methods in the numerical approximation of the fully-discrete value function. In the first approach, the authors choose a set of nodes in the original domain \(\Omega \subset {{\mathbb {R}}}^n\) and project them onto a reduced space \(\Omega ^r \subset {{\mathbb {R}}}^r\) with \(r<n\) to get a new set of nodes. The problem with this procedure is that it produces a nonuniform grid whose mesh diameter cannot be predicted a priori. Consequently, the method is not suitable for practical implementation. Furthermore, although this is not reflected in the error bounds in [1], the error also depends on the interpolation properties of the a priori unknown reduced mesh in \(\Omega ^r \subset {{\mathbb {R}}}^r\). In the second approach, the authors use a uniform mesh over the reduced space \(\Omega ^r\). This second method can be implemented in practice (the numerical experiments in [1] are carried out with this method). However, as the authors state, the computation of an upper bound for the error in this case is much more involved, and the error bound proved in [1] has some drawbacks; see [1, Remark 4.7, Remark 4.9].

Recently, a new error analysis was introduced in [7], in which a bound of size \(O(h+k)\) is obtained for the fully discrete approximations to infinite horizon problems via the dynamic programming approach. In this error bound, h is the time step and k is the spatial mesh diameter. This error bound improves existing results in the literature, where only O(k/h) error bounds are proved; see [12, 13].

To bound the error of the first method in [1], the authors follow the technique in [12, Corollary 2.4], [13, Theorem 1.3], obtaining an error bound of size O(k/h). For the second method in [1], the factor 1/h also multiplies all the terms on the right-hand side of the a priori error bound.

In this paper we present a new approach, similar to the second method in [1], but with snapshots based on the values at different times of the time derivative of the state of the controlled nonlinear dynamical system, instead of values of the state itself. This new approach is inspired by the recent results in [15], where the authors prove that the use of snapshots based on time derivatives has the advantage of providing pointwise estimates for the error between a function and its projection onto the POD space. The idea of using snapshots approximating the time derivatives is not new, although most references in the literature employ first-order difference quotients (DQs) (i.e., first-order divided finite differences) instead of Galerkin time derivatives, as in [15] and the present paper. In [21] the set of snapshots (at different times) is augmented with DQs to carry out the error analysis for the case in which projections with respect to the \(H_0^1\) norm are considered. In a more recent paper, [20], the authors show that the use of DQs has the added property of allowing one to prove pointwise estimates in time. In a later paper, [11], the authors prove that one does not need to double the set of snapshots with values at different times plus DQs, since DQs plus a single initial value are enough to get pointwise estimates. This is a very interesting result because one can work with the same number of snapshots as in the standard case (the one with only values of the state at different times). In [15], the authors prove that this is also the case with time derivatives: a set of snapshots based on time derivatives plus the snapshot at the initial time (or the mean value of the states) is able to provide pointwise-in-time error estimates. This is the idea we apply in the present paper. Moreover, we carry out a different error analysis based on the recent results in [7], which allows us to get sharper error bounds free of 1/h factors.
This is in agreement with the numerical investigations in the literature, where the 1/h behaviour in the error bounds of fully discrete methods has never been observed. Also, the use of snapshots based on time derivatives allows us to bound some terms that could not be bounded with the standard approach. Both facts, the new technique used to bound the error following ideas in [7] and the use of snapshots based on time derivatives, are the key ingredients to obtain error bounds for the new method that are optimal in terms of the time step h and the mesh diameter of the reduced space \(k_r\). As usual, our error bounds for the POD method also depend on the size of the tail of eigenvalues in the singular value decomposition.

The outline of the paper is as follows. In Sect. 2 we state the model problem and some preliminary results. In Sect. 3 we introduce the POD approximation and carry out the error analysis of the method. Finally, in Sect. 4 we show some numerical experiments in which we implement the method we propose in the paper. In the experiments of Sect. 4 we choose the same numerical tests as in [1] to compare our results with those in this related reference. The method introduced in the present paper seems to produce better results than those shown in [1]. We finish the paper with some conclusions.

2 Model Problem and Standard Numerical Approximation

In the sequel, \(\Vert \cdot \Vert \) denotes any norm associated to an inner product and \(\Vert \cdot \Vert _\infty \) denotes the maximum norm for vectors in \({{\mathbb {R}}}^n\), \(n\ge 1\). We also denote by \(\Vert \cdot \Vert _2\) the standard Euclidean norm. In particular, in the numerical experiments we use a weighted norm \(\Vert \cdot \Vert \) slightly different from the standard Euclidean norm \(\Vert \cdot \Vert _2\).

For a nonlinear mapping

$$\begin{aligned} f:{{\mathbb {R}}}^n\times {{\mathbb {R}}}^m\rightarrow {{\mathbb {R}}}^n, \end{aligned}$$

and a given initial condition \(y_0 \in {{\mathbb {R}}}^n\) let us consider the controlled nonlinear dynamical system

$$\begin{aligned} \dot{y}(t)=f(y(t),u(t))\in {{\mathbb {R}}}^n,\quad t>0,\quad y(0)=y_0\in {{\mathbb {R}}}^n, \end{aligned}$$
(1)

together with the infinite horizon cost functional

$$\begin{aligned} J(y,u)=\int _0^\infty g(y(t),u(t))e^{-\lambda t}~dt. \end{aligned}$$
(2)

In (2) \(\lambda >0\) is a given weighting parameter and

$$\begin{aligned} g:{{\mathbb {R}}}^n\times {{\mathbb {R}}}^m\rightarrow {{\mathbb {R}}}. \end{aligned}$$

The set of admissible controls is

$$\begin{aligned} {{\mathbb {U}}}_{\textrm{ad}}=\left\{ u\in {{\mathbb {U}}}\mid u(t)\in U_{\textrm{ad}} \ {\textrm{for}}\ \textrm{almost}\ \textrm{all}\ t\ge 0\right\} , \end{aligned}$$

where \({{\mathbb {U}}}=L^2(0,\infty ;{{\mathbb {R}}}^m)\) and \(U_{\textrm{ad}}\subset {{\mathbb {R}}}^m\) is a compact convex subset.
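Before discretizing the HJB equation it is useful to have the primitive objects (1)–(2) in computable form. The following sketch (all problem data — dynamics, cost, control, truncation horizon — are illustrative choices, not data from the paper) integrates the state equation with forward Euler and approximates the discounted cost by a truncated quadrature.

```python
import numpy as np

def discounted_cost(f, g, y0, u, lam, T=50.0, dt=1e-3):
    """Approximate J(y,u) = ∫_0^∞ g(y(t),u(t)) e^{-λt} dt from (2) by
    truncating the integral at T and stepping (1) with forward Euler."""
    y = np.asarray(y0, dtype=float)
    J, t = 0.0, 0.0
    while t < T:
        ut = u(t)
        J += g(y, ut) * np.exp(-lam * t) * dt   # left-endpoint quadrature
        y = y + dt * f(y, ut)                   # Euler step for (1)
        t += dt
    return J

# Illustrative data: f(y,u) = -y + u, g(y,u) = |y|^2 + |u|^2, u ≡ 0, λ = 1.
# Then y(t) = e^{-t} y0, and the exact cost is ∫_0^∞ e^{-3t} dt = 1/3 for y0 = 1.
J = discounted_cost(lambda y, u: -y + u,
                    lambda y, u: y @ y + u * u,
                    y0=[1.0], u=lambda t: 0.0, lam=1.0)
```

The exponential discount factor \(e^{-\lambda t}\) makes the truncation error negligible once \(\lambda T\) is moderately large.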

As in [1, Assumption 2.1] we assume the following hypotheses:

  • The right-hand side f in (1) is continuous and globally Lipschitz-continuous in both the first and second arguments; i.e., there exists a constant \(L_f>0\) satisfying

    $$\begin{aligned} \Vert f(y,u)-f({\tilde{y}},u)\Vert\le & L_f\Vert y-{\tilde{y}}\Vert ,\quad \forall y,{\tilde{y}}\in {{\mathbb {R}}}^n, u\in U_{\textrm{ad}}, \end{aligned}$$
    (3)
    $$\begin{aligned} \Vert f(y,u)-f(y,{\tilde{u}})\Vert\le & L_f \Vert u-{\tilde{u}}\Vert ,\quad \forall u,{\tilde{u}}\in U_{\textrm{ad}}, y\in {{\mathbb {R}}}^n. \end{aligned}$$
    (4)
  • The right-hand side f in (1) satisfies that there exists a constant \(M_f>0\) such that the following bound holds

    $$\begin{aligned} \Vert f(y,u)\Vert _\infty \le M_f, \quad \forall y\in {{\overline{\Omega }}}\subset {{\mathbb {R}}}^n, u\in U_{\textrm{ad}}, \end{aligned}$$
    (5)

    where \({{\overline{\Omega }}}\) is a bounded polyhedron such that for sufficiently small \(h>0\) the following inward pointing condition on the dynamics holds

    $$\begin{aligned} y+hf(y,u)\in {{\overline{\Omega }}},\quad \forall y \in {{\overline{\Omega }}}, u\in U_{\textrm{ad}}. \end{aligned}$$
    (6)
  • The running cost g is continuous and globally Lipschitz-continuous in both the first and second arguments; i.e., there exists a constant \(L_g>0\) satisfying

    $$\begin{aligned} |g(y,u)-g({\tilde{y}},u)|\le & L_g\Vert y-{\tilde{y}}\Vert ,\quad \forall y,{\tilde{y}}\in {{\mathbb {R}}}^n, u\in U_{\textrm{ad}}, \end{aligned}$$
    (7)
    $$\begin{aligned} |g(y,u)-g(y,{\tilde{u}})|\le & L_g \Vert u-{\tilde{u}}\Vert ,\quad \forall u,{\tilde{u}}\in U_{\textrm{ad}}, y \in {{\mathbb {R}}}^n. \end{aligned}$$
    (8)
  • Moreover, there exists a constant \(M_g>0\) such that

    $$\begin{aligned} |g(y,u)|\le M_g,\quad \forall (y,u)\in {{\overline{\Omega }}}\times U_{\textrm{ad}}. \end{aligned}$$
    (9)

From the assumptions made on f there exists a unique solution of (1) \(y=y(y_0,u)\) defined on \([0,\infty )\) for every admissible control \(u\in {{\mathbb {U}}}_{\textrm{ad}}\) and for every initial condition \(y_0\in {{\mathbb {R}}}^n\), see [4, Chapter 3]. We define the reduced cost functional as follows:

$$\begin{aligned} {\hat{J}}(y_0,u)=J(y(y_0,u),u),\quad \forall u\in {{\mathbb {U}}}_{\textrm{ad}},\quad y_0\in {{\mathbb {R}}}^n, \end{aligned}$$
(10)

where \(y(y_0,u)\) solves (1). Then, the optimal control can be formulated as follows: for given \(y_0\in {{\mathbb {R}}}^n\) we consider

$$\begin{aligned} \min _{u\in {{\mathbb {U}}}_{\textrm{ad}}} {\hat{J}}(y_0,u). \end{aligned}$$

The value function of the problem is defined as \(v:{\mathbb R}^n\rightarrow {{\mathbb {R}}}\) as follows:

$$\begin{aligned} v(y)=\inf \left\{ {\hat{J}}(y,u)\mid u\in {{\mathbb {U}}}_{\textrm{ad}}\right\} ,\quad y\in {{\mathbb {R}}}^n. \end{aligned}$$
(11)

This function gives the best value for every initial condition, given the set of admissible controls \(U_{\textrm{ad}}\). It is characterized as the viscosity solution of the HJB equation corresponding to the infinite horizon optimal control problem:

$$\begin{aligned} \lambda v(y)+\sup _{u\in U_{\textrm{ad}}}\left\{ -f(y,u)\cdot \nabla v(y)-g(y,u)\right\} =0,\quad y\in {{\mathbb {R}}}^n. \end{aligned}$$
(12)

The solution of (12) is unique for sufficiently large \(\lambda \), namely \(\lambda >\max (L_g,L_f)\); see [4].

Let us consider first a time discretization where h is a strictly positive step size. We consider the following semidiscrete scheme for (12):

$$\begin{aligned} v_h(y)=\min _{u\in U_{\textrm{ad}}}\left\{ (1-\lambda h)v_h(y+hf(y,u))+h g(y,u)\right\} ,\quad y\in {{\mathbb {R}}}^n. \end{aligned}$$
(13)

As is well known, equation (13) represents a numerical approximation related to the HJB equation (12) (see Remark 7). The following convergence result for the semidiscrete approximation [12, Theorem 2.3] requires that for \((y,{\tilde{y}},u)\in {{\mathbb {R}}}^n\times {{\mathbb {R}}}^n\times U_{\textrm{ad}}\)

$$\begin{aligned} \Vert f(y+{\tilde{y}},u)-2f(y,u)+f(y-{\tilde{y}},u)\Vert\le & C_f\Vert {\tilde{y}}\Vert ^2, \end{aligned}$$
(14)
$$\begin{aligned} |g(y+{\tilde{y}},u)-2g(y,u)+g(y-{\tilde{y}},u)|\le & C_g\Vert {\tilde{y}}\Vert ^2. \end{aligned}$$
(15)

Theorem 1

Let assumptions (3), (5), (6), (7), (9), (14) and (15) hold and let \(\lambda >\max (2L_g,L_f)\). Let v and \(v_h\) be the solutions of (12) and (13), respectively. Then, there exists a constant \(C\ge 0\), which can be bounded explicitly, such that the following bound holds

$$\begin{aligned} \sup _{y\in {{\mathbb {R}}}^n}|v(y)-v_h(y)|\le Ch,\quad h\in [0,1/ \lambda ). \end{aligned}$$
(16)

As in [1] let us suppose that there exists a bounded polyhedron \(\Omega \subset {{\mathbb {R}}}^n\) such that for \(h>0\) small enough (6) holds. We consider a fully-discrete approximation to (12). Let \(\left\{ S_j\right\} _{j=1}^{m_s}\) be a family of simplices which defines a regular triangulation of \(\Omega \)

$$\begin{aligned} {{\overline{\Omega }}}=\bigcup _{j=1}^{m_s} S_j,\quad k=\max _{1\le j\le m_s}(\textrm{diam} \ S_j). \end{aligned}$$

We assume we have \(n_s\) vertices/nodes \({\hat{y}}^1,\ldots ,{\hat{y}}^{n_s}\) in the triangulation. Let \(V^k\) be the space of piecewise affine functions from \({{\overline{\Omega }}}\) to \({{\mathbb {R}}}\) which are continuous in \({{\overline{\Omega }}}\) having constant gradients in the interior of any simplex \(S_j\) of the triangulation. Then, a fully discrete scheme for the HJB equations is given by

$$\begin{aligned} v_{h,k}({\hat{y}}^i)=\min _{u\in {U}_{\textrm{ad}}}\left\{ (1-\lambda h)v_{h,k}({\hat{y}}^i+hf({\hat{y}}^i,u))+hg({\hat{y}}^i,u)\right\} , \end{aligned}$$
(17)

for any vertex \({\hat{y}}^i\in {{\overline{\Omega }}}\). There exists a unique solution of (17) in the space \(V^k\), see [4, Theorem 1.1, Appendix A].
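Since \(1-\lambda h<1\), the right-hand side of (17) is a contraction in the maximum norm, which is what makes simple value iteration converge to the fully discrete value function. The sketch below solves a hypothetical one-dimensional instance (dynamics \(f(y,u)=u\), cost \(g(y,u)=y^2\), a three-point control set; none of this data comes from the paper): piecewise-linear interpolation plays the role of the space \(V^k\), and clipping to the domain stands in for the inward-pointing condition (6).

```python
import numpy as np

# Hypothetical data: f(y,u) = u with U_ad = {-1, 0, 1}, g(y,u) = y^2,
# discount λ = 1, step h = 0.05, domain Ω = [-1, 1].
lam, h = 1.0, 0.05
nodes = np.linspace(-1.0, 1.0, 81)      # vertices of the 1D "triangulation"
U_ad = [-1.0, 0.0, 1.0]
v = np.zeros_like(nodes)                # initial guess for v_{h,k}

for _ in range(600):                    # value iteration: v ← right-hand side of (17)
    cand = []
    for u in U_ad:
        y_next = np.clip(nodes + h * u, -1.0, 1.0)   # y + h f(y,u), kept in Ω̄
        # evaluate the piecewise-linear interpolant of v (the role of V^k)
        cand.append((1 - lam * h) * np.interp(y_next, nodes, v)
                    + h * nodes**2)                  # + h g(y,u)
    v_new = np.min(cand, axis=0)        # minimum over the control set
    if np.max(np.abs(v_new - v)) < 1e-13:
        v = v_new
        break
    v = v_new
```

The contraction factor \(1-\lambda h\) gives linear convergence of the iteration, so a few hundred sweeps suffice at this tolerance.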

For the fully discrete method if we assume that the controls are Lipschitz-continuous; i.e., there exists a positive constant \(L_u>0\) such that

$$\begin{aligned} \Vert u(t)-u(s)\Vert _2\le L_u |t-s|, \end{aligned}$$
(18)

then first order of convergence both in time and space is proved in [7, Theorem 6].

Theorem 2

Assume conditions (3)–(5), (7)– (9) and (18) hold. Assume \(\lambda >{\overline{L}}\) with \({\overline{L}}=CnL_f\). Then, for \(0\le h\le 1/(2\lambda )\) there exist positive constants \({C_1= C_1(\lambda ,M_f,M_g,L_f,L_g)}\) and \({C_2= C_2(\lambda ,L_f,L_g,L_u)}\) such that

$$\begin{aligned} |v(y)-v_{h,k}(y)|\le C_1(h+k)+{C_2h},\quad y\in {{\overline{\Omega }}}. \end{aligned}$$

Condition (18) can be weakened and one can still get convergence as proved in [7, Theorem 7]. Assume the following convexity assumption introduced in [5, (A4)] and denoted by (CA) as in [5, 7],

  • (CA) For every \(y\in {{\mathbb {R}}}^n\),

$$\begin{aligned} \left\{ (f(y,u), g(y,u))\mid u\in U_{\textrm{ad}}\right\} \end{aligned}$$

    is a convex subset of \({{\mathbb {R}}}^{n+1}\).

Theorem 3

Assume conditions (3), (4), (5), (7), (8), (9) and (CA) hold. Assume \(\lambda >{\overline{L}}\) with \({\overline{L}}\) defined as in Theorem 2. Then, for \(0\le h\le 1/(2\lambda )\) there exist positive constants \(C_1= C_1(\lambda ,M_f,M_g,L_f,L_g)\) and \(C_2= C_2(\lambda ,M_f,M_g,L_f,L_g)\) such that for \(y\in {{\overline{\Omega }}}\)

$$\begin{aligned} |v(y)-v_{h,k}(y)|\le C_1(h+k)+C_2 \frac{1}{(1+\beta )^2\lambda ^2}(\log (h))^2h^{\frac{1}{1+\beta }},\quad \beta =\frac{\sqrt{n}L_f}{\lambda }. \end{aligned}$$
(19)

Let us observe that, since \(\beta \) is smaller than 1, by weakening the regularity requirements we lose at most half an order in the rate of convergence in time of the method, up to a logarithmic term.

3 POD Approximation of the Optimal Control Problem Based on Time Derivatives

In this section we present a new approach, similar to the second method in [1], but with snapshots based on time derivatives at different times. We also perform a completely different error analysis from the one in [1], inspired by the results in [7] and [15].

3.1 POD Approximation Based on Time Derivatives

For \(p\in {\mathbb {N}}\) let us choose p different pairs \(\left\{ (u^\nu ,y_0^\nu )\right\} _{\nu =1}^p\) in \({{\mathbb {U}}}\times {{\overline{\Omega }}}\). Since \({{\mathbb {U}}}=L^2(0,\infty ;{{\mathbb {R}}}^m)\), the controls do not need to be constant, unlike those taken in the numerical experiments. By \(y^\nu =y(u^\nu ;y_0^\nu )\), \(\nu =1,\ldots ,p\), we denote the solutions of (1) corresponding to the chosen initial conditions and controls.

Let us fix \(T>0\) and \(M\in {\mathbb {N}}\) and take \(\Delta t=T/M\) and \(t_j=j\Delta t\), \(j=0,\ldots , M\). For \(N=M+1\) we define the following space

$$\begin{aligned} \mathcal{{\varvec{V}}}=\text{ span }\left\{ z_1^\nu ,z_2^\nu ,\ldots ,z_N^\nu \right\} _{\nu =1}^p, \end{aligned}$$

with

$$\begin{aligned} z_1^\nu= & \sqrt{N}{\overline{y}}^\nu ,\quad {\overline{y}}^\nu =\frac{1}{N}\sum _{j=0}^My^\nu (t_j)\\ z_j^\nu= & \tau y_{t}^\nu (t_{j-1}),\ j=2,\ldots ,N, \end{aligned}$$

so that

$$\begin{aligned} \mathcal{{\varvec{V}}}= \text{ span }\left\{ \sqrt{N}{\overline{y}}^\nu ,\tau y_{t}^\nu (t_{1}),\ldots ,\tau y_{t}^\nu (t_{M})\right\} _{\nu =1}^p, \end{aligned}$$

where the factor \(\tau \) in front of the temporal derivatives is a time scale and it makes the snapshots dimensionally correct. In the numerical experiments we take \(\tau =1\). The correlation matrix corresponding to the snapshots is given by \({K}=((k_{i,j}))\in {{\mathbb {R}}}^{pN\times pN}\), with the entries

$$\begin{aligned} k_{(\nu -1)N+i,\,(\mu -1)N+j}=\frac{1}{pN}\left( z_i^{\nu },z_j^{\mu }\right) , \quad i,j=1,\ldots ,N, \quad \nu ,\mu =1,\ldots , p, \end{aligned}$$

and where here, and in the sequel, \((\cdot ,\cdot )\) denotes the inner product in \({{\mathbb {R}}}^n\) to which the norm \(||\cdot ||\) is associated. Let us denote for simplicity

$$\begin{aligned} \mathcal{{\varvec{V}}}=\text{ span }\left\{ w_1,w_2,\ldots ,w_{pN}\right\} :=\text{ span }\left\{ z_1^1,\ldots ,z_N^1,\ldots ,z_1^p,\ldots ,z_N^p\right\} . \end{aligned}$$

Following [21], we denote by \( \lambda _1\ge \lambda _2\ge \cdots \ge \lambda _{d}>0\) the positive eigenvalues of K and by \({\varvec{v}}_1,\ldots ,{\varvec{v}}_{d}\in {{\mathbb {R}}}^{pN}\) the associated eigenvectors of Euclidean norm 1. Then, the (orthonormal) POD basis functions of \(\mathcal {\varvec{V}}\) are given by

$$\begin{aligned} \varphi _k=\frac{1}{\sqrt{pN}}\frac{1}{\sqrt{\lambda _k}}\sum _{j=1}^{pN} v_k^j w_j,\ k=1,\ldots ,d, \end{aligned}$$
(20)

where \(v_k^j\) is the j-th component of the eigenvector \({\varvec{v}}_k\). The following error estimate is known from [21, Proposition 1]

$$\begin{aligned} \frac{1}{pN}\sum _{j=1}^{pN}\left\| w_j-\sum _{k=1}^r(w_j,\varphi _k)\varphi _k\right\| ^2=\sum _{k=r+1}^{d}\lambda _k, \end{aligned}$$
(21)

from which one can deduce for \(\nu =1,\ldots ,p\)

$$\begin{aligned} \left\| {\overline{y}}^\nu -\sum _{k=1}^r(\overline{y}^\nu ,\varphi _k)\varphi _k\right\| ^2+\frac{\tau ^2}{M+1}\sum _{j=1}^M\left\| y_{t}^{\nu }(t_j)-\sum _{k=1}^r(y_{t}^{\nu }(t_j),\varphi _k)\varphi _k\right\| ^2\le p\sum _{k=r+1}^{d}\lambda _k. \end{aligned}$$
(22)
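The snapshot set, correlation matrix and POD basis (20) can be sketched as follows for \(p=1\) and a hypothetical linear system \(\dot{y}=Ay\), for which the time derivatives \(y_t(t_j)=Ay(t_j)\) are available exactly; all sizes and data below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, M, tau, dt = 20, 40, 1.0, 0.025
N = M + 1
A = -np.eye(n) + 0.1 * rng.standard_normal((n, n))  # stable linear dynamics (assumed)

y = rng.standard_normal(n)
Y, Yt = [y], [A @ y]                  # y(t_0) and y_t(t_0)
for _ in range(M):
    y = y + dt * (A @ y)              # forward-Euler trajectory y(t_j)
    Y.append(y)
    Yt.append(A @ y)                  # exact time derivative y_t(t_j) = A y(t_j)
Y, Yt = np.array(Y), np.array(Yt)

# Snapshot set: z_1 = √N ȳ and z_j = τ y_t(t_{j-1}), j = 2,…,N
W = np.vstack([np.sqrt(N) * Y.mean(axis=0), tau * Yt[1:]])
K = (W @ W.T) / N                     # correlation matrix (p = 1)
lams, V = np.linalg.eigh(K)           # eigh returns ascending order…
lams, V = lams[::-1], V[:, ::-1]      # …so reverse to decreasing eigenvalues
d = int(np.sum(lams > 1e-8 * lams[0]))            # numerically positive eigenvalues
Phi = (W.T @ V[:, :d]) / np.sqrt(N * lams[:d])    # POD basis (20), columns φ_k
```

The columns of `Phi` are orthonormal, and the truncation identity (21) can be verified directly: the mean squared projection residual of the snapshots onto the first r modes equals the tail sum of eigenvalues.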

In the sequel, we will denote by

$$\begin{aligned} \mathcal{{\varvec{V}}}^r= \text{ span }\{\varphi _1,\varphi _2,\ldots ,\varphi _r\},\quad 1\le r\le d, \end{aligned}$$
(23)

and by \(P^r:{{\mathbb {R}}}^n \rightarrow \mathcal{{\varvec{V}}}^r\) the orthogonal projection onto \(\mathcal{{\varvec{V}}}^r\). Then (21) can be written as

$$\begin{aligned} \frac{1}{pN}\sum _{j=1}^{pN}\left\| w_j-P^r w_j\right\| ^2=\sum _{k=r+1}^{d}\lambda _k. \end{aligned}$$

The following lemma is proved in [15, Lemma 3.2].

Lemma 1

Let \(T>0\), \(\Delta t=T/M\), \(t^n=n\Delta t\), \(n=0,1,\ldots ,M\), let X be a Banach space and \({\varvec{z}}\in H^2(0,T;X)\), and write \({\varvec{z}}^n={\varvec{z}}(t^n)\). Then, the following estimate holds

$$\begin{aligned} \max _{0\le n\le M }\Vert {\varvec{z}}^n\Vert _X^2 \le {3}\Vert {{\overline{{\varvec{z}}}}}\Vert _X^2+\frac{12 T^2}{M}\sum _{n=1}^M \Vert {\varvec{z}}_t^n\Vert _X^2+\frac{16T}{3}(\Delta t)^2\int _0^T\Vert {\varvec{z}}_{tt}(s)\Vert _X^2\ ds, \end{aligned}$$
(24)

where \( {{\overline{{\varvec{z}}}}}=\frac{1}{M+1}\sum _{j=0}^M{\varvec{z}}^j\).

Using Lemma 1 we can prove pointwise estimates for the projections onto \( \mathcal{{\varvec{V}}}^r\).

Lemma 2

The following bounds hold for \(\nu =1,\ldots ,p\)

$$\begin{aligned} \max _{0\le j\le M }\Vert y^\nu (t_j)-P^r y^\nu (t_j)\Vert ^2 \le \left( 3+24\frac{T^2}{\tau ^2}\right) p\sum _{k={r+1}}^{d}\lambda _k +\frac{16T}{3}(\Delta t)^2 \int _0^T\Vert y^{\nu }_{tt}(s)\Vert ^2\ ds. \end{aligned}$$
(25)

Proof

We argue as in [15, Lemma 3.4]. Taking \({\varvec{z}}^j=y^\nu (t_j)-P^r y^\nu (t_j)\), \(j=0,\ldots ,M\), in (24) and applying (22) (taking into account that \((M+1)/M\le 2\)) yields

$$\begin{aligned} \max _{0\le j\le M }\Vert y^\nu (t_j)-P^r y^\nu (t_j)\Vert ^2\le & \left( 3+24\frac{T^2}{\tau ^2}\right) p\sum _{k={r+1}}^{d}\lambda _k\nonumber \\ & +\frac{16T}{3}(\Delta t)^2\int _0^T\Vert y^\nu _{tt}(s)-P^r y^\nu _{tt}(s)\Vert ^2\ ds. \end{aligned}$$

Now, since \(P^r\) is an orthogonal projection, we have \(\Vert y^\nu _{tt}(s)-P^r y^\nu _{tt}(s)\Vert ^2\le \Vert y^\nu _{tt}(s)\Vert ^2\) and the proof is finished. \(\square \)

3.2 The POD Control Problem

To mitigate the curse of dimensionality, the idea of the POD method is to work in a space of dimension r with \(r<n\). To start we need to introduce some notation. We use a slightly different notation from the one used in [1]. In particular, as stated below, \(P_c^r\) is always used to denote coefficients and \(\varvec{\varphi }\) is always used for the linear combination based on the POD basis functions of the reduced order space. More precisely, for any \(y\in {{\overline{\Omega }}}\subset {{\mathbb {R}}}^n\) let us denote by \(P_c^r y\in {{\mathbb {R}}}^r\) the coefficients of the projection of y onto \(\mathcal{{\varvec{V}}}^r\)

$$\begin{aligned} P_c^r y=\left\{ (y,\varphi _k)\right\} _{k=1}^r. \end{aligned}$$
(26)

For any \(y^r\in {{\mathbb {R}}}^r\) let us denote by \(\varvec{\varphi }y^r\in {{\mathbb {R}}}^n\) the vector whose coefficients in the POD basis are the components of \(y^r\), i.e.,

$$\begin{aligned} \varvec{\varphi }y^r=\sum _{j=1}^r y^r_j \varphi _j, \end{aligned}$$
(27)

where \(y^r_j\) is the j-th component of the vector \(y^r\).

For f and g in (1), (2) and \((y^r,u)\in {{\mathbb {R}}}^r\times U_{\textrm{ad}}\) we define

$$\begin{aligned} f^r(y^r,u)= & P^r_cf(\varvec{\varphi }y^r,u)\in {{\mathbb {R}}}^r,\nonumber \\ g^r(y^r,u)= & g(\varvec{\varphi }y^r,u)\in {{\mathbb {R}}}. \end{aligned}$$
(28)
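In computations, the maps (26)–(28) are plain matrix–vector products with the matrix \(\Phi =[\varphi _1,\ldots ,\varphi _r]\). The sketch below uses a random orthonormal \(\Phi \) as a stand-in for the POD basis (20), together with hypothetical f and g; the reduced data \(f^r,g^r\) then follow directly from (28).

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 20, 3
Phi, _ = np.linalg.qr(rng.standard_normal((n, r)))  # stand-in orthonormal POD basis

Pc = lambda y: Phi.T @ y       # P_c^r y = {(y, φ_k)}_{k=1}^r in R^r, eq. (26)
lift = lambda yr: Phi @ yr     # φ y^r = Σ_j y^r_j φ_j in R^n,       eq. (27)

# Hypothetical full-order data f, g and the induced reduced-order data (28)
A, e1 = -np.eye(n), np.eye(n)[:, 0]
f = lambda y, u: A @ y + u * e1
g = lambda y, u: y @ y + u * u
f_r = lambda yr, u: Pc(f(lift(yr), u))   # f^r(y^r,u) = P_c^r f(φ y^r, u)
g_r = lambda yr, u: g(lift(yr), u)       # g^r(y^r,u) = g(φ y^r, u)
```

Note that `lift(Pc(y))` is exactly \(P^r y\), the identity \(\varvec{\varphi }P^r_c=P^r\) used repeatedly in the proofs, and \(\Vert P_c^r z\Vert _2=\Vert P^r z\Vert \) as in the proof of Lemma 3.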

To have an inward pointing condition on the dynamics in the reduced space, analogous to (6), following [1, Section 4.2], we assume that there exists a bounded polyhedron \({{\overline{\Omega }}}^r\subset {{\mathbb {R}}}^r\) satisfying

$$\begin{aligned} P_c^r y\in \Omega ^r,\quad \forall y\in {{\overline{\Omega }}}. \end{aligned}$$
(29)

The following lemma proves that the inward pointing condition for \(\Omega ^r\) follows from (29).

Lemma 3

Condition (29) implies that

$$\begin{aligned} y^r+hf^r(y^r,u)\in {{\overline{\Omega }}}^r,\quad y^r=P_c^r y, \ y\in {{\overline{\Omega }}}, \end{aligned}$$

provided the step size h or \(\Vert P^r y-y\Vert \) is sufficiently small.

Proof

We follow [1, Remark 4.5] for the proof. We first observe that

$$\begin{aligned} y^r+hf^r(y^r,u)=P_c^r y+h P^r_c f(\varvec{\varphi }y^r,u). \end{aligned}$$

Adding and subtracting \(hP^r_c f(y,u)\) we get

$$\begin{aligned} y^r+hf^r(y^r,u)=P_c^r(y+hf(y,u))+hP_c^r(f(\varvec{\varphi }y^r,u)-f(y,u)). \end{aligned}$$
(30)

Applying condition (6), \(y+hf(y,u)\in {{\overline{\Omega }}}\), and then (29), the first term on the right-hand side of (30) satisfies \(P_c^r(y+hf(y,u))\in \Omega ^r\). Then, we only need to show that the second term on the right-hand side of (30) is small enough for h or \(\Vert P^r y-y\Vert \) sufficiently small.

Let us denote \(z=f(\varvec{\varphi }y^r,u)-f(y,u)\in {{\mathbb {R}}}^n\). Since \(P_c^r z=\left\{ (z,\varphi _k)\right\} _{k=1}^r\in {{\mathbb {R}}}^r\), taking into account that the functions \(\varphi _k\) define an orthonormal basis and that \(P^r\) is a projection, we have

$$\begin{aligned}\Vert P_c^r z\Vert _2=\Vert P^rz\Vert \le \Vert z\Vert .\end{aligned}$$

Applying the above inequality together with (3), we get

$$\begin{aligned} \Vert P_c^r z\Vert _2^2= & \Vert P_c^r(f(\varvec{\varphi }y^r,u)-f(y,u))\Vert _2^2\le \Vert f(\varvec{\varphi }y^r,u)-f(y,u)\Vert ^2 \nonumber \\\le & L_f^2 \Vert \varvec{\varphi }y^r-y\Vert ^2=L_f^2 \Vert P^r y-y\Vert ^2, \end{aligned}$$
(31)

so that the proof is concluded. \(\square \)

We can now define the reduced order problem we solve in practice. For \(f^r\) and \(g^r\) defined in (28) and a given initial condition \(y_0^r \in {{\mathbb {R}}}^r\) let us consider the controlled nonlinear dynamical system

$$\begin{aligned} \dot{y}^r(t)=f^r(y^r(t),u(t))\in {{\mathbb {R}}}^r,\quad t>0,\quad y^r(0)=y_0^r\in {{\mathbb {R}}}^r, \end{aligned}$$
(32)

together with the infinite horizon cost functional

$$\begin{aligned} J^r(y^r,u)=\int _0^\infty g^r(y^r(t),u(t))e^{-\lambda t}~dt. \end{aligned}$$
(33)

As in (10), we define the reduced cost functional

$$\begin{aligned} {\hat{J}}^r(y_0^r,u)=J^r(y^r(y_0^r,u),u),\quad \forall u\in {\mathbb U}_{\textrm{ad}},\quad y_0^r\in {{\mathbb {R}}}^r, \end{aligned}$$
(34)

where \(y^r(y_0^r,u)\) solves (32). Then, the POD optimal control can be formulated as follows: for given \(y_0^r\in {{\mathbb {R}}}^r\) we consider

$$\begin{aligned} \min _{u\in {{\mathbb {U}}}_{\textrm{ad}}} {\hat{J}}^r(y_0^r,u). \end{aligned}$$

The value function of the problem \(v^r:{{\mathbb {R}}}^r\rightarrow {{\mathbb {R}}}\) is defined as follows:

$$\begin{aligned} v^r(y^r)=\inf \left\{ {\hat{J}}^r(y^r,u)\mid u\in {{\mathbb {U}}}_{\textrm{ad}}\right\} ,\quad y^r\in {{\mathbb {R}}}^r. \end{aligned}$$
(35)

Remark 1

It is easy to check that the regularity assumptions for \(f^r\) and \(g^r\) analogous to those for f and g, (3), (4), (5), (7), (8) and (9), hold from the definition of \(f^r\) and \(g^r\) and the properties being true for f and g.

To obtain in practice a fully discrete approximation in the reduced space, let \(\left\{ S_j^r\right\} _{j=1}^{m_s^r}\) be a family of simplices which defines a regular triangulation of \(\Omega ^r\). We assume we have \(n_s\) vertices/nodes \({\hat{y}}^{1}_{r},\ldots ,{\hat{y}}^{n_s}_{r}\in {{\overline{\Omega }}}^r\) in the triangulation and

$$\begin{aligned} {{\overline{\Omega }}}^r=\bigcup _{j=1}^{m_s^r} S_j^r,\quad k_r=\max _{1\le j\le m_s^r}(\textrm{diam} \ S_j^r). \end{aligned}$$

Let \(V^{k_r}\) be the space of piecewise affine functions from \({{\overline{\Omega }}}^r\) to \({{\mathbb {R}}}\) which are continuous in \({{\overline{\Omega }}}^r\) having constant gradients in the interior of any simplex \(S_j^r\) of the triangulation. As in [1, (4.15)] we introduce the following POD fully discrete scheme for the HJB equations

$$\begin{aligned} v_{h,k}^r({\hat{y}}^{i}_{r})=\min _{u\in {U}_{\textrm{ad}}}\left\{ (1-\lambda h)v_{h,k}^r({\hat{y}}^{i}_{r}+hf^r({\hat{y}}^{i}_{r},u))+hg^r({\hat{y}}^{i}_{r},u)\right\} ,\ i=1,\ldots ,n_s, \end{aligned}$$
(36)

for any vertex \({\hat{y}}^{i}_{r}\in {{\overline{\Omega }}}^r\). As in (17), there exists a unique solution of (36) in the space \(V^{k_r}\), defined by its nodal values in (36), see [4, Theorem 1.1, Appendix A].

The key point to carry out the error analysis is that (36) is the fully discrete approximation to the continuous problem with value function defined in (35). Moreover, we can apply Theorems 2 and 3 with v and \(v_{h,k}\) replaced by \(v^r\) and \(v^r_{h,k}\).

As in [1], for any node \({\hat{y}}^{i}_{r}\in {{\overline{\Omega }}}^r\) we set

$$\begin{aligned} {\hat{y}}^i=\varvec{\varphi }{\hat{y}}^{i}_{r},\quad i=1,\ldots ,n_s, \end{aligned}$$

and define

$$\begin{aligned} {\tilde{v}}_{h,k}^r(y)=v_{h,k}^r(P^r_c y),\quad \forall y\in {{\overline{\Omega }}}. \end{aligned}$$
(37)

For \({\hat{y}}^i\), \(i=1,\ldots ,n_s\), by definition we have

$$\begin{aligned} {\tilde{v}}_{h,k}^r({\hat{y}}^i)=v_{h,k}^r(P^r_c {\hat{y}}^{i})=v_{h,k}^r({\hat{y}}^{i}_{r}), \end{aligned}$$

since \({\hat{y}}^{i}_{r}\in {{\mathbb {R}}}^r\) are the coordinates of \({\hat{y}}^i\in {{\mathbb {R}}}^n\) with respect to the basis functions of \(\mathcal{{\varvec{V}}}^r\) (23), see (26), (27).

Taking into account that \({\hat{y}}^{i}_{r}+hf^r({\hat{y}}^{i}_{r},u)=P_c^r({\hat{y}}^i+hf({\hat{y}}^i,u))\) and \(g^r({\hat{y}}_r^{i},u)=g({\hat{y}}^i,u)\), (36) can also be written as (see [1, (4.17)])

$$\begin{aligned} {\tilde{v}}_{h,k}^r({\hat{y}}^{i})=\min _{u\in {U}_{\textrm{ad}}}\left\{ (1-\lambda h){\tilde{v}}_{h,k}^r({\hat{y}}^{i}+hf({\hat{y}}^{i},u))+hg({\hat{y}}^{i},u)\right\} ,\ i=1,\ldots ,n_s. \end{aligned}$$

Nevertheless, we do not use the above characterization of the fully discrete value function to bound the error.
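Putting the pieces together, (36) can be solved by the same value iteration used for (17), now on a grid in \(\Omega ^r\), after which \({\tilde{v}}_{h,k}^r\) of (37) is evaluated at any \(y\in {{\overline{\Omega }}}\) through the coefficient map \(P_c^r\). In the sketch below all data (a single POD mode, dynamics, cost, control set) are hypothetical, with \(r=1\).

```python
import numpy as np

n, lam, h = 10, 1.0, 0.05
phi = np.ones(n) / np.sqrt(n)            # single orthonormal POD mode (r = 1)
Pc = lambda y: float(phi @ y)            # coefficient map (26), scalar for r = 1

f = lambda y, u: -y + u * phi            # assumed full-order dynamics
g = lambda y, u: y @ y                   # assumed running cost
f_r = lambda yr, u: Pc(f(phi * yr, u))   # reduced data (28)
g_r = lambda yr, u: g(phi * yr, u)

grid = np.linspace(-1.0, 1.0, 101)       # nodes of the triangulation of Ω^r
U_ad = np.linspace(-1.0, 1.0, 11)
vr = np.zeros_like(grid)
for _ in range(300):                     # value iteration for (36)
    cand = []
    for u in U_ad:
        nxt = np.clip(grid + h * np.array([f_r(yr, u) for yr in grid]), -1.0, 1.0)
        run = np.array([g_r(yr, u) for yr in grid])
        cand.append((1 - lam * h) * np.interp(nxt, grid, vr) + h * run)
    vr = np.min(cand, axis=0)

# ṽ^r_{h,k}(y) = v^r_{h,k}(P_c^r y), eq. (37): project, then interpolate in V^{k_r}
v_tilde = lambda y: np.interp(Pc(y), grid, vr)
```

The full-order value function is thus only ever queried through its r coefficients, which is where the reduction in dimensionality pays off.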

3.3 Error Analysis of the Method

To prove the main results of the paper we need a preliminary lemma, which we now state and prove. Lemma 4 bounds the difference between the value function of the original problem (11) and the value function of the reduced order problem (35).

Lemma 4

Let v and \(v^r\) be the solutions of (11) and (35), respectively. For \(y\in {{\overline{\Omega }}}\), let \(P^r y\in {{\mathbb {R}}}^n\) be the projection of y onto \(\mathcal{{\varvec{V}}}^r\) and let \(P^r_c y\in {{\mathbb {R}}}^r\) be as defined in (26). Then, the following bound holds

$$\begin{aligned} |v(P^r y)-v^r(P^r_c y)|\le & \ L_g\int _0^\infty te^{(L_f-\lambda )t}\max _{0\le s\le t}\Vert (I-P^r)f(y(s),u(s))\Vert dt\nonumber \\ & \quad + L_g\int _0^\infty te^{(L_f-\lambda )t}\max _{0\le s\le t}\Vert (I-P^r)f(y_r(s),u^r(s))\Vert dt, \end{aligned}$$
(38)

where \(u,u^r:[0,\infty )\rightarrow {{\mathbb {R}}}^m\) are the controls giving the minimum in (11) and (35) (with initial conditions \(P^r y\) and \(P^r_c y\), respectively), y(t) is the solution of (1) with \(y(0)=P^r y\) and control u(t), i.e. \(y=y(P^r y,u)\) and \(y_r(t)\) is the solution of (1) with \(y(0)=P^r y\) and control \(u^r(t)\), i.e., \(y_r=y(P^r y,u^r)\).

Proof

We argue as in [7, Lemma 2]. Let \(w:[0,\infty )\rightarrow {{\mathbb {R}}}^m\) be a given control that we now fix and let y(t) be the solution of (1) with \(y(0)=P^r y\) and control w(t). Then

$$\begin{aligned} y(t)=P^r y +\int _0^t f(y(s),w(s))\,ds. \end{aligned}$$
(39)

Let \( y^r(t)\) be the solution of (32) with control w(t) and \(y^r(0)=P^r_c y\). Then, recalling definitions (26) and (27) from which we obtain \(\varvec{\varphi }P^r_c y=P^r y\), we can write

$$\begin{aligned} \varvec{\varphi }y^r(t)=P^r y+\int _0^t\varvec{\varphi }P^r_c f(\varvec{\varphi }y^r(s),w(s))\,ds. \end{aligned}$$
(40)

Subtracting (40) from (39) we get

$$\begin{aligned} y(t)-\varvec{\varphi }y^r(t)= & \int _0^t (f(y(s),w(s))-\varvec{\varphi }P^r_c f(\varvec{\varphi }y^r(s),w(s)))\,ds\\= & \int _0^t(f(y(s),w(s))-\varvec{\varphi }P^r_c f(y(s),w(s)))\,ds\\ & \quad +\int _0^t (\varvec{\varphi }P^r_c f(y(s),w(s))-\varvec{\varphi }P^r_c f(\varvec{\varphi }y^r(s),w(s)))\,ds. \end{aligned}$$

Taking norms

$$\begin{aligned} \Vert y(t)-\varvec{\varphi }y^r(t)\Vert\le & \int _0^t \Vert (I-P^r)f(y(s),w(s))\Vert \,ds \nonumber \\ & +\int _0^t \Vert \varvec{\varphi }P^r_c f(y(s),w(s))-\varvec{\varphi }P^r_c f(\varvec{\varphi }y^r(s),w(s))\Vert \,ds. \end{aligned}$$
(41)

Using again that \(\varvec{\varphi }P^r_c=P^r\) and applying that \(P^r\) is a projection together with (3) we get

$$\begin{aligned} & \Vert \varvec{\varphi }P^r_c f(y(s),w(s))-\varvec{\varphi }P^r_c f(\varvec{\varphi }y^r(s),w(s))\Vert ^2\\ & \quad =\Vert P^r \left( f(y(s),w(s))- f(\varvec{\varphi }y^r(s),w(s))\right) \Vert ^2\\ & \quad \le \Vert f(y(s),w(s))-f(\varvec{\varphi }y^r(s),w(s))\Vert ^2\le L_f^2 \Vert y(s)-\varvec{\varphi }y^r(s)\Vert ^2. \end{aligned}$$

Going back to (41)

$$\begin{aligned} \Vert y(t)-\varvec{\varphi }y^r(t)\Vert \le L_f\int _0^t\Vert y(s)-\varvec{\varphi }y^r(s)\Vert ds +t\max _{0\le s\le t}\Vert (I-P^r)f(y(s),w(s))\Vert . \end{aligned}$$

Applying Gronwall’s lemma we get

$$\begin{aligned} \Vert y(t)-\varvec{\varphi }y^r(t)\Vert \le {e^{L_ft}}\left( t\max _{0\le s\le t}\Vert (I-P^r)f(y(s),w(s))\Vert \right) . \end{aligned}$$
(42)

We now observe that from definitions (10) and (34) we get

$$\begin{aligned} |{\hat{J}}(P^r y,w)-{\hat{J}}^r(P^r_c y,w)|\le \int _0^\infty |g(y(t),w(t))-g(\varvec{\varphi }y^r(t),w(t))|e^{-\lambda t}dt. \end{aligned}$$

Applying then the Lipschitz-continuity property of g, (7), together with (42) we get

$$\begin{aligned} & |{\hat{J}}(P^r y,w)-{\hat{J}}^r(P^r_c y,w)|\nonumber \\ & \le \quad L_g\int _0^\infty t{e^{(L_f-\lambda )t}}\max _{0\le s\le t}\Vert (I-P^r)f(y(s),w(s))\Vert \ dt. \end{aligned}$$
(43)

To conclude we argue similarly to [7, Theorem 4].

For any \(y\in {{\overline{\Omega }}}\), let \(u^r:[0,\infty )\rightarrow {{\mathbb {R}}}^m\) be the control giving the minimum in (35) with initial condition \(P^r_c y\). Since by definition of v, \(v(P^r y)\le {\hat{J}}(P^r y,u^r)\) and \(v^r(P^r_c y)={\hat{J}}^r(P^r_c y,u^r)\), applying (43) with \(w=u^r\) we get

$$\begin{aligned} v(P^r y)-v^r(P^r_c y)\le & {\hat{J}}(P^r y,u^r)-{\hat{J}}^r(P^r_c y,u^r)\nonumber \\\le & L_g\int _0^\infty te^{(L_f-\lambda )t}\max _{0\le s\le t}\Vert (I-P^r)f(y_r(s),u^r(s))\Vert \ dt, \end{aligned}$$
(44)

where \(y_r(t)\) is the solution of (1) with \(y(0)=P^r y\) and control \(u^r(t)\).

On the other hand, let \(u:[0,\infty )\rightarrow {{\mathbb {R}}}^m\) be the control giving the minimum in (11) with initial condition \(P^r y\). Arguing as in (44) and applying (43) again, with \(w=u\), we get

$$\begin{aligned} v^r(P^r_c y)-v(P^r y)\le & {\hat{J}}^r(P^r_c y,u)-{\hat{J}}(P^r y,u)\nonumber \\\le & L_g\int _0^\infty te^{(L_f-\lambda )t}\max _{0\le s\le t}\Vert (I-P^r)f(y(s),u(s))\Vert \ dt, \end{aligned}$$
(45)

where y(t) is the solution of (1) with \(y(0)=P^r y\) and control u. From (44) and (45) we conclude (38). \(\square \)
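The Gronwall-type estimate (42), which drives the proof above, can be checked numerically. The sketch below is purely illustrative: it assumes linear dynamics \(f(y)=Ay\) (so that \(L_f=\Vert A\Vert _2\)), a random orthonormal basis V playing the role of the POD basis, and explicit Euler stepping in place of the exact flows.

```python
import numpy as np

# Illustrative check of the Gronwall-type bound (42) for linear dynamics
# f(y) = A y, with V an orthonormal basis playing the role of the POD basis,
# P = V V^T the orthogonal projection, and explicit Euler time stepping.
rng = np.random.default_rng(0)
n, r = 6, 3
A = -np.eye(n) + 0.3 * rng.standard_normal((n, n))  # stand-in for f(y) = A y
V, _ = np.linalg.qr(rng.standard_normal((n, r)))
P = V @ V.T                                         # plays the role of P^r
Lf = np.linalg.norm(A, 2)                           # Lipschitz constant of f

dt, T = 1e-3, 1.0
y = P @ rng.standard_normal(n)                      # y(0) = P^r y on the subspace
yr = V.T @ y                                        # reduced coordinates
max_res = np.linalg.norm((np.eye(n) - P) @ (A @ y))  # ||(I-P) f(y(0))||
ok = True
for k in range(1, int(T / dt) + 1):
    y = y + dt * (A @ y)                            # full model
    yr = yr + dt * (V.T @ (A @ (V @ yr)))           # Galerkin reduced model
    max_res = max(max_res, np.linalg.norm((np.eye(n) - P) @ (A @ y)))
    t = k * dt
    # bound (42): ||y(t) - V yr(t)|| <= t e^{Lf t} max_{s<=t} ||(I-P) f(y(s))||
    ok = ok and np.linalg.norm(y - V @ yr) <= t * np.exp(Lf * t) * max_res + 1e-12
print(ok)
```

The discrete error recursion satisfies the same integral inequality as in the proof, so the bound holds at every time step.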

In the next theorem we bound the difference between the value function of the original problem and our fully discrete approximation based on POD. Let \(u^r\) be the control giving the minimum in (35). For the proof of the next theorem we need to assume that there exists a positive constant \(L_{u^r}>0\) such that

$$\begin{aligned} \Vert u^r(t)-u^r(s)\Vert _2\le L_{u^r} |t-s|. \end{aligned}$$
(46)

Theorem 4

Let v be the solution of (11) and let \({\tilde{v}}_{h,k}^r\) be its fully discrete POD approximation defined in (36)-(37). Assume conditions (3), (4), (5), (7), (8), (9), (14), (15) and (46) hold. Assume \(\lambda >\max (2L_g,L_f,{\overline{L}}_r)\) with \({\overline{L}}_r=CrL_f\). Then, for \(0\le h\le 1/(2\lambda )\) there exist positive constants \(C_1\) and \(C_2\) such that the following bound holds for any \(y\in {{\overline{\Omega }}}\)

$$\begin{aligned} |v(y)-{\tilde{v}}_{h,k}^r(y)|\le & C_1(h+k_r)+{C_2h}+\frac{L_g}{\lambda -L_f}\Vert y-P^r y\Vert \nonumber \\ & \ + L_g\int _0^\infty te^{(L_f-\lambda )t}\max _{0\le s\le t}\Vert (I-P^r)f(y(s),u(s))\Vert dt\nonumber \\ & \ + L_g\int _0^\infty te^{(L_f-\lambda )t}\max _{0\le s\le t}\Vert (I-P^r)f(y_r(s),u^r(s))\Vert dt, \end{aligned}$$
(47)

where \(u,u^r:[0,\infty )\rightarrow {{\mathbb {R}}}^m\) are the controls giving the minimum in (11) and (35) (with initial conditions \(P^r y\) and \(P^r_c y\), respectively), y(t) is the solution of (1) with \(y(0)=P^r y\) and control u(t), i.e., \(y=y(P^r y,u)\) and \(y_r(t)\) is the solution of (1) with \(y(0)=P^r y\) and control \(u^r(t)\), i.e., \(y_r=y(P^r y,u^r)\).

Proof

We first observe that adding and subtracting terms we get

$$\begin{aligned} |v(y)-{\tilde{v}}_{h,k}^r(y)|\le & |v(y)-v(P^ry)|+|v(P^r y)-v^r(P^r_c y)|+|v^r(P^r_c y)-{\tilde{v}}_{h,k}^r(y)|. \end{aligned}$$

To bound the first term we write

$$\begin{aligned} |v(y)-v(P^ry)|\le |v(y)-v_h(y)|+|v_h(y)-v_h(P^r y)|+|v_h(P^r y)-v(P^r y)|, \end{aligned}$$

and then apply (16) to the first and third terms and the Lipschitz-continuity of \(v_h\) that holds for \(\lambda >L_f\) (see [13, p. 473]) to the second term. Then

$$\begin{aligned} |v(y)-v(P^ry)|\le C h+\frac{L_g}{\lambda -L_f}\Vert y-P^r y\Vert . \end{aligned}$$

To bound the second term we apply Lemma 4. To conclude we need to bound the third term. To this end we observe that

$$\begin{aligned} v^r(P^r_c y)-{\tilde{v}}_{h,k}^r(y)=v^r(P^r_c y)-v_{h,k}^r(P^r_c y) \end{aligned}$$

so that we can apply Theorem 2 to \(v^r\) and \(v^r_{h,k}\) to reach (47). \(\square \)

Remark 2

Let us observe that the first two terms on the right-hand side of (47) give the rate of convergence of the method in terms of the time step h and mesh diameter \(k_r\). The other three terms come from the POD approximation and will be bounded at the end of this section. These terms depend on the set of snapshots and the tail of the eigenvalues in the singular value decomposition.

To apply Theorem 2 in the proof of Theorem 4 we use the properties of \(f^r\) and \(g^r\) that, as commented in Remark 1, are inherited from the hypotheses (3), (4), (5), (7), (8), (9) assumed on f and g. We also need to assume that condition (46) holds for the control of the reduced-order problem. Condition (46) can be weakened: one can still get convergence assuming instead the following convexity assumption:

  • (CAr) For every \(y^r\in {{\mathbb {R}}}^r\),

    $$\begin{aligned} \left\{ f^r(y^r,u), g^r(y^r,u),\quad u\in U_{\textrm{ad}}\right\} \end{aligned}$$

    is a convex subset of \({{\mathbb {R}}}^{r+1}\).

This result is stated in Theorem 5 below. In the proof of the theorem we apply Theorem 3 instead of Theorem 2.

Theorem 5

Let v be the solution of (11) and let \({\tilde{v}}_{h,k}^r\) be its fully discrete POD approximation defined in (36)-(37). Assume conditions (3), (4), (5), (7), (8), (9), (14), (15) and (CAr) hold. Assume \(\lambda >\max (2L_g,L_f,{\overline{L}}_r)\) with \({\overline{L}}_r=CrL_f\). Then, for \(0\le h\le 1/(2\lambda )\) there exist positive constants \(C_1\) and \(C_2\) such that the following bound holds for any \(y\in {{\overline{\Omega }}}\)

$$\begin{aligned} |v(y)-{\tilde{v}}_{h,k}^r(y)|\le & C_1(h+k_r)+C_2 \frac{1}{(1+\beta )^2\lambda ^2}(\log (h))^2h^{\frac{1}{1+\beta }}\nonumber \\ & \ +\frac{L_g}{\lambda -L_f}\Vert y-P^r y\Vert \nonumber \\ & \ + L_g\int _0^\infty te^{(L_f-\lambda )t}\max _{0\le s\le t}\Vert (I-P^r)f(y(s),u(s))\Vert dt\nonumber \\ & \ + L_g\int _0^\infty te^{(L_f-\lambda )t}\max _{0\le s\le t}\Vert (I-P^r)f(y_r(s),u^r(s))\Vert dt, \end{aligned}$$
(48)

where \(\beta =\frac{\sqrt{r}L_f}{\lambda }\) and \(u,u^r:[0,\infty )\rightarrow {{\mathbb {R}}}^m\) are the controls giving the minimum in (11) and (35) (with initial conditions \(P^r y\) and \(P^r_c y\), respectively), y(t) is the solution of (1) with \(y(0)=P^r y\) and control u(t), i.e., \(y=y(P^r y,u)\), and \(y_r(t)\) is the solution of (1) with \(y(0)=P^r y\) and control \(u^r(t)\), i.e., \(y_r=y(P^r y,u^r)\).

Proof

The proof is the same as the proof of Theorem 4 applying Theorem 3 instead of Theorem 2. \(\square \)

Remark 3

The same comments as in Remark 2 apply, with the difference that the rate of convergence in terms of the time step h is reduced due to the weaker regularity requirements. Since \(\beta \) is smaller than 1, we lose at most half an order in the rate of convergence in time, up to a logarithmic term.

To conclude, we estimate the last three terms in (47) and (48). The first of them is bounded in the following lemma, where, recall, p is the number of trajectories in the set of snapshots (see Sect. 3.1).

Lemma 5

For \(y\in {{\overline{\Omega }}}\) and \(P^r\) the X-orthogonal projection onto \(\mathcal{{\varvec{U}}}^r\), the following bound holds

(49)

Proof

Let k and j be such that

Then, we can write

$$\begin{aligned} y - P^ry =(I-P^r) \left( y - y^{j}(t_k) \right) + y^{j}(t_k) - P^r y^{j}(t_k). \end{aligned}$$
(50)

Now, noticing that \(\left\| I-P^r\right\| \le 1\) and recalling Lemma 2, (49) follows from (50). \(\square \)

Remark 4

The first term on the right-hand side of (49) reflects the closeness of the data y to the set of snapshots, while the other term is the projection error onto the POD basis. Let us observe that using time derivatives in the set of snapshots allows us to get a bound for this projection error in the discrete maximum norm in time, see Lemma 2.

We bound the second term on the right-hand side of (47) in the following lemma.

Lemma 6

For \(y\in {{\overline{\Omega }}}\) and \(P^r\) the orthogonal projection onto \(\mathcal{{\varvec{U}}}^r\) let \(u:[0,\infty )\rightarrow {{\mathbb {R}}}^m\) be the control giving the minimum in (11) with initial condition \(P^r y\). Let y(t) be the solution of (1) with \(y(0)=P^r y\) and control u(t). Then, for any fixed \(s\in [0,\infty )\) the following bound holds

(51)

Proof

We argue as in the proof of Lemma 5. Let k and j be such that

Then,

$$\begin{aligned} & (I-P^r)f(y(s),u(s))=(I-P^r)(f(y(s),u(s))-f(y^j(t_k),u^j(t_k)))\\ & \quad \quad +(I-P^r)f(y^j(t_k),u^j(t_k)), \end{aligned}$$

so that applying (22) and \(\Vert I-P^r\Vert \le 1\) we get (51). \(\square \)

Remark 5

An error bound for \(\Vert (I-P^r)f(y_r(s),u^r(s))\Vert \) can be obtained by arguing in exactly the same way.

Remark 6

As in (49), the first term on the right-hand side of (51) reflects the closeness of f(y(s), u(s)) to the set of snapshots and the second one is the projection error onto the POD basis. Let us observe that the use of temporal derivatives in the set of snapshots is essential to get the bound (51).

In case the errors in (22) are uniformly distributed along the discrete times, as is often the case (at least in our experience with numerical computations concerning POD methods, see [15, Figure 1]), one would expect the second term on the right-hand side of (51) to behave like \( \frac{p}{\tau ^2}\sum _{k=r+1}^d\lambda _k \) instead of the cruder bound \(\frac{(M+1)}{\tau ^2}p\sum _{k=r+1}^d\lambda _k\).

Remark 7

As in [1, 3, 7, 24], we do not provide in this paper error bounds for the reduced control whose values at the nodes are obtained by solving (36). Although this would be interesting, we are not aware of similar error bounds in the literature. Actually, we think that the starting point should be to get those bounds for the controls in the fully discrete scheme of the original (not reduced) method. We remark that the results in [7], error bounds for the fully discrete problem, on which the theory of the present paper is based, are very recent. In fact, the bounds in [7] improve the error bounds of the fully discrete method over previous results obtained more than 25 years ago. In [13, Section 1.2] the reconstruction of approximate optimal controls is considered, comparing the fully discrete case with the case that is semi-discrete in time. Getting bounds for the computed controls of the fully discrete problem with respect to the original problem could be an interesting subject of future research, for which at the moment we do not know whether the techniques in [13, Section 1.2] can be extended. To complete the discussion we include below a heuristic argument concerning the convergence of the approximate controls in the fully discrete case. For any node \({\hat{y}}^i\) in the triangulation, let us denote by \(u_{h,k}^i\) the control giving the minimum in (17). On the other hand, let us denote by \(u^i\) the optimal control in the HJB equation (12) for \(y={\hat{y}}^i\). As stated in [13, Section 1.2], the controls may not be unique, but one can select the control with minimum norm. Now, let us observe that for the discrete value function it holds that

$$\begin{aligned} v_{h,k}({\hat{y}}^i+hf({\hat{y}}^i,u^i_{h,k}))=v_{h,k}({\hat{y}}^i)+hf({\hat{y}}^i,u^i_{h,k})\cdot \nabla v_{h,k}({\hat{y}}^i)+O(h^2). \end{aligned}$$
(52)

Taking into account that

$$\begin{aligned} v_{h,k}({\hat{y}}^i)=(1-\lambda h)v_{h,k}({\hat{y}}^i+hf({\hat{y}}^i,u^i_{h,k}))+hg({\hat{y}}^i,u^i_{h,k}), \end{aligned}$$
(53)

and inserting (52) into (53) we get

$$\begin{aligned} v_{h,k}({\hat{y}}^i)= & (1-\lambda h)\left( v_{h,k}({\hat{y}}^i)+hf({\hat{y}}^i,u^i_{h,k})\cdot \nabla v_{h,k}({\hat{y}}^i)+O(h^2)\right) \\ & +hg({\hat{y}}^i,u^i_{h,k}). \end{aligned}$$

And then

$$\begin{aligned} \lambda h v_{h,k}({\hat{y}}^i)=hf({\hat{y}}^i,u^i_{h,k})\cdot \nabla v_{h,k}({\hat{y}}^i)+hg({\hat{y}}^i,u^i_{h,k})+O(h^2). \end{aligned}$$

From which

$$\begin{aligned} \lambda v_{h,k}({\hat{y}}^i)=f({\hat{y}}^i,u^i_{h,k})\cdot \nabla v_{h,k}({\hat{y}}^i)+g({\hat{y}}^i,u^i_{h,k})+O(h). \end{aligned}$$

Now, since

$$\begin{aligned} \lambda v({\hat{y}}^i)=f({\hat{y}}^i,u^i)\cdot \nabla v({\hat{y}}^i)+g({\hat{y}}^i,u^i), \end{aligned}$$
(54)

and \(v_{h,k}({\hat{y}}^i)\rightarrow v({\hat{y}}^i),\) for \(h,k\rightarrow 0\) we obtain

$$\begin{aligned} f({\hat{y}}^i,u^i_{h,k})\cdot \nabla v_{h,k}({\hat{y}}^i)+g({\hat{y}}^i,u^i_{h,k})\rightarrow f({\hat{y}}^i,u^i)\cdot \nabla v({\hat{y}}^i)+g({\hat{y}}^i,u^i). \end{aligned}$$
(55)

Arguing as in [13, Section 1.2], let us define

$$\begin{aligned} L(y,u)=\frac{1}{\lambda }\left( f(y,u)\cdot \nabla v(y)+g(y,u)\right) , \end{aligned}$$

and let us associate with y a (unique) control u(y) such that

$$\begin{aligned} L(y,u(y))=\min _{u\in U_{\textrm{ad}}}L(y,u)=v(y). \end{aligned}$$

Assume

$$\begin{aligned} \nabla v_{h,k}({\hat{y}}^i)\rightarrow \nabla v({\hat{y}}^i) \end{aligned}$$
(56)

(which we have not proved) and \(u_{h,k}^i\rightarrow {\overline{u}}^i\) for \(h,k\rightarrow 0\). Then, on the one hand, from (55),

$$\begin{aligned} L({\hat{y}}^i,u_{h,k}^i)\rightarrow L({\hat{y}}^i,u^i), \end{aligned}$$

and, on the other

$$\begin{aligned} L({\hat{y}}^i,u_{h,k}^i)\rightarrow L({\hat{y}}^i,{\overline{u}}^i) \end{aligned}$$

which implies \({\overline{u}}^i=u^i\) and \(u_{h,k}^i\rightarrow u^i\) as \(h,k\rightarrow 0\). Finally, let us observe that the argument in [13, Section 1.2] had already proved that, for any fixed h and \(k\rightarrow 0\), the fully discrete controls \(u_{h,k}^i\) converge to the corresponding semi-discrete-in-time control defined in (13) for that value of h.

4 Numerical Experiments

We now present some numerical experiments. We closely follow those in [1] so that the new method we propose can be compared with the method in [1]. The authors in [1] use state snapshots in the reduced-order method instead of snapshots based on time derivatives. We observe that, as explained in detail in the introduction, in the latter case it is not necessary to consider both state snapshots and time derivatives, since it has already been proved that optimal bounds can be obtained with time derivatives alone. We have chosen the closed-loop control approach instead of the open-loop one in the present paper, and the theory of the present paper is developed for the former. We do not compare the present method with other methods based on this approach, since our aim is just to propose, analyze and check in practice a new method that could be better or not (probably depending on the examples) than other methods in the literature. The numerical experiments of this section show that our method works well in practice and is able to provide accurate approximations.

We first notice that for numerical reasons we have to choose a finite time horizon, so we select a sufficiently large \(t_e>0\), which, in the experiments that follow, was fixed at \(t_e=3\). As in [1], we consider the following convection-reaction-diffusion equation

$$\begin{aligned} \begin{array}{rcll} z_t-\varepsilon z_{xx} + \gamma z_x + \mu (z^3-z) & =& ub\quad & \hbox {in }I\times (0,t_e),\\ z(\cdot ,0)& =& z_0\quad & \hbox {in }I,\\ z(\cdot ,t) & =& 0\quad & \hbox {in }\partial I\times (0,t_e), \end{array} \end{aligned}$$
(57)

with \(\varepsilon =1/10\), and where \(I=(0,a)\) is an open interval, \(z: I\times [0,t_e]\rightarrow {{\mathbb {R}}}\) denotes the state, and \(\gamma \) and \(\mu \) are nonnegative constants. The controls u belong to the closed, convex, bounded set \({{\mathbb {U}}}_{\textrm{ad}} = L^2(0,t_e;[u_a,u_b])\), for real values \(u_a<u_b\). The cost functional to minimize is given by

$$\begin{aligned} \int _{0}^{t_e} e^{-\lambda t}\left( \left\| z(\cdot ,t,u)\right\| _{L^2(I)}^2 + \frac{1}{100}\left| u(t)\right| ^2\right) \,dt, \end{aligned}$$
(58)

where we set \(\lambda =1\). Notice then that in (58) the aim is to drive the state to zero.

We use a finite-difference method on a uniform grid of size \(\Delta x=a/N\) with \(N=100\) on the interval \(I=(0,a)\) to discretize (57) in space, obtaining a system of ordinary differential equations (ODEs). To obtain the snapshots, the ODE system is integrated in time using Matlab's command ode15s, which implements the numerical differentiation formulae (NDF) [25], with sufficiently small tolerances for the local errors (below \(10^{-12}\)). The snapshots were obtained on a uniform (time) grid of diameter 1/20. The time derivatives were obtained by evaluating the right-hand side of the system of ODEs.
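The snapshot computation just described can be sketched as follows. This is a hedged illustration, not the authors' code: SciPy's stiff BDF solver stands in for Matlab's ode15s/NDF, a second-order centered discretization replaces the fourth-order compact scheme of Sect. 4.1, and the tolerances are relaxed for speed.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hedged sketch of the snapshot generation: a centered finite-difference
# semidiscretization of (57) (second order here, for brevity), integrated
# with SciPy's stiff BDF solver as a stand-in for Matlab's ode15s/NDF.
# Time derivatives are recovered by evaluating the ODE right-hand side.
eps, gamma, mu, a, N, te = 0.1, 0.0, 1.0, 1.0, 100, 3.0
dx = a / N
x = np.linspace(dx, a - dx, N - 1)       # interior nodes
b = 2 * x * (1 - x)                      # control shape function b(x)

def rhs(t, y, u):
    yp = np.concatenate(([0.0], y, [0.0]))          # z = 0 on the boundary
    yxx = (yp[2:] - 2 * yp[1:-1] + yp[:-2]) / dx**2
    yx = (yp[2:] - yp[:-2]) / (2 * dx)
    return eps * yxx - gamma * yx - mu * (y**3 - y) + u * b

snaps, derivs = [], []
for u in (-1.0, 0.0, 1.0):               # constant controls, as in U_snap
    sol = solve_ivp(rhs, (0.0, te), 2 * x * (1 - x), args=(u,), method="BDF",
                    rtol=1e-8, atol=1e-10, t_eval=np.linspace(0.05, te, 60))
    snaps.append(sol.y)                  # one state snapshot every 1/20 units
    derivs.append(np.column_stack([rhs(0.0, sol.y[:, j], u)
                                   for j in range(sol.y.shape[1])]))
Y, D = np.hstack(snaps), np.hstack(derivs)   # 99 x 180 snapshot matrices
```

The columns of D, the evaluated right-hand sides, are the time-derivative snapshots from which the POD basis is later extracted.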

As in [1], equation (36)

$$\begin{aligned} v_{h,k}^r(y^{i}_r)=\min _{u\in {U}_{\textrm{ad}}}\left\{ (1-\lambda h)v_{h,k}^r(y^{i}_r+hf^r(y^{i}_r,u))+hg^r(y^{i}_r,u)\right\} ,\ i=1,\ldots ,n_s, \end{aligned}$$

was solved by fixed-point iteration, the stopping criterion being that two consecutive iterates differ in the maximum norm by less than a given tolerance \(\textrm{TOL}_v\), initially set to \(\textrm{TOL}_v=5\times 10^{-4}\). For the first iterate we choose a family of constant controls \(u^l\), \(l=1,\ldots ,p\), and at any point \(y_r^i\) of the mesh (initial condition) we compute the approximate solution of (32) corresponding to this initial condition and control \(u^l\). Then, we compute the value of the cost functional (33). Finally, the value of the initial iterate at \(y_r^i\) is the minimum of the values of the cost functional for \(l=1,\ldots ,p\).
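A toy one-dimensional analogue of this fixed-point iteration may help fix ideas. The dynamics f, running cost g, mesh and tolerance below are hypothetical stand-ins, not the reduced-order model of the paper; nodal values are interpolated linearly between mesh points.

```python
import numpy as np

# Toy 1D analogue of the fixed-point iteration for (36): iterate
# v(y) <- min_u {(1 - lam*h) v(y + h f(y,u)) + h g(y,u)} on a uniform mesh,
# with linear interpolation between nodal values. f, g, mesh and tolerance
# are illustrative stand-ins, not the reduced-order model of the paper.
lam, h, TOL = 1.0, 1e-3, 5e-4
mesh = np.linspace(-1.0, 1.0, 201)            # nodes playing the role of y_r^i
U = np.linspace(-1.0, 1.0, 41)                # discretized control set
f = lambda y, u: -y + u                       # the mesh is invariant, cf. (64)
g = lambda y, u: y**2 + 0.01 * u**2

v = np.zeros_like(mesh)
for it in range(100000):
    targets = mesh[:, None] + h * f(mesh[:, None], U[None, :])
    cand = (1 - lam * h) * np.interp(targets, mesh, v) \
        + h * g(mesh[:, None], U[None, :])
    v_new = cand.min(axis=1)                  # minimize over the controls
    done = np.max(np.abs(v_new - v)) < TOL    # stopping criterion of the text
    v = v_new
    if done:
        break
```

Since the iteration is a contraction with factor \(1-\lambda h\) in the maximum norm, the successive differences decrease geometrically and the loop terminates.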

Once (36) is solved, the optimal control

$$\begin{aligned} u_{h,k}^r(y^{i}_r)={{\,\textrm{argmin}\,}}_{u\in {U}_{\textrm{ad}}}\left\{ (1-\lambda h)v_{h,k}^r(y^{i}_r+hf^r(y^{i}_r,u))+hg^r(y^{i}_r,u)\right\} , \end{aligned}$$
(59)

is obtained at any mesh point \(y^i_r\), \(i=1,\ldots ,n_s\). Then, the suboptimal feedback operator \(\Phi ^r(y)\) is computed by interpolation. This means that for \(y\in {{\overline{\Omega }}}\) we project onto the POD space to get \(P^r y\) and then

$$\begin{aligned} P^r y=\sum _{i=1}^{n_s}\mu _i y^i_r,\quad \Phi ^r(y)=\sum _{i=1}^{n_s}\mu _i u_{h,k}^r(y^{i}_r), \end{aligned}$$

where the coefficients \(\mu _i\) satisfy \(0\le \mu _i\le 1\), \(\sum _{i=1}^{n_s} \mu _i=1\).
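A hedged sketch of this interpolation step, under the assumption that the coefficients \(\mu _i\) are the barycentric coordinates of \(P^r y\) in the simplex of the triangulation containing it (the mesh and nodal controls below are illustrative stand-ins):

```python
import numpy as np
from scipy.spatial import Delaunay

# Sketch of the feedback interpolation: locate the simplex containing P^r y,
# compute its barycentric coordinates mu_i (nonnegative, summing to 1) and
# form the same convex combination of the nodal controls.
def feedback(tri, u_nodes, yr):
    s = tri.find_simplex(yr)                 # simplex containing yr
    T = tri.transform[s]
    bary = T[:-1] @ (yr - T[-1])             # first r barycentric coordinates
    mu = np.append(bary, 1.0 - bary.sum())   # the coefficients mu_i
    return mu @ u_nodes[tri.simplices[s]]

rng = np.random.default_rng(1)
corners = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
pts = np.vstack([corners, rng.random((20, 2))])   # mesh points y_r^i (r = 2)
u_nodes = pts[:, 0] - pts[:, 1]              # stand-in nodal controls
tri = Delaunay(pts)
u = feedback(tri, u_nodes, np.array([0.4, 0.5]))
```

Since the stand-in nodal controls are linear in the mesh points, the convex combination reproduces them exactly: here u equals 0.4 - 0.5 = -0.1 up to rounding.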

With this, the closed-loop system

$$\begin{aligned} y'(t)=f(y(t),\Phi ^r(y(t))), \qquad y(0)=y_0, \end{aligned}$$
(60)

is integrated, again, using the NDF formulae as implemented in Matlab's command ode15s with the same tolerances as in the computation of the snapshots. We will see below that very different approximations to the solution of (60) can be obtained with different values of the tolerance \(\textrm{TOL}_v\) for the fixed-point iteration solving (36) (see Fig. 3), so we solved this equation for decreasing values of \(\textrm{TOL}_v\), each one 5 times smaller than the previous one, until the relative error between the solutions of (60) corresponding to two consecutive values of \(\textrm{TOL}_v\) was below 10% (it usually turned out to drop dramatically from above 10% to less than 0.01%). Here and in the sequel, by the relative error of a quantity \({\hat{y}}\) with respect to y we mean \(\left| y-{\hat{y}}\right| /\max (\left| y\right| ,10^{-3})\). For the optimal HJB states, for every value of time t for which the solution of (60) was computed, we computed the maximum of the relative errors of the components of y(t). For Test 2 in Sect. 4.2, due to the discontinuous initial datum, it proved impossible to drive the relative error between two optimal HJB states computed with two different tolerances \(\textrm{TOL}_v\) below 10%, so we checked instead that the relative error measured in the norm (63) below was smaller than 10%.
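The relative-error convention just introduced, written out as a small helper (the function name is ours):

```python
import numpy as np

# Relative error in the sense of the text: |y - yhat| / max(|y|, 1e-3),
# applied componentwise and maximized over the components of a state vector.
def rel_err(y, yhat):
    y = np.atleast_1d(np.asarray(y, dtype=float))
    yhat = np.atleast_1d(np.asarray(yhat, dtype=float))
    return float(np.max(np.abs(y - yhat) / np.maximum(np.abs(y), 1e-3)))
```

For instance, rel_err(2.0, 1.9) is 0.05, while rel_err(0.0, 1e-4) is 0.1 because the denominator is floored at \(10^{-3}\) when y is close to zero.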

With respect to the computational cost of solving (59) by fixed-point iteration, it is obviously proportional to the number of iterations. In the experiments below, these values were 1382 for all values of r in Test 1 below, 362, 263 and 242 for \(r=2,3,4\), respectively, in Test 2 below, and 365, 1103 and 1539 for \(r=3,4,5\) in Test 3 in Sect. 4.3 below. On each iteration, the bulk of the cost lies in finding the nonnegative scalars \(\mu _j^i\), \(j=1,\ldots ,n_s\), such that \(y^{i}_r+hf^r(y^{i}_r,u)= \mu _1^i y_r^1 +\cdots + \mu _{n_s}^i y_r^{n_s}\), which accounted for 70% of the cost of the iteration for \(r=5\) in Test 3 in Sect. 4.3, 79% for \(r=4\) in Test 2, and 95% for \(r=4\) in Test 1 in Sect. 4.1, followed by the cost of obtaining \(f^r(y^{i}_r,u)\), \(i=1,\ldots ,{n_s}\), which ranged from 4% for \(r=4\) in Tests 1 and 2 below to 28% for \(r=5\) in Test 3 in Sect. 4.3. We note that the cost of obtaining \(f^r(y^{i}_r,u)\) can be substantially diminished using appropriate tensors or by means of techniques like discrete empirical interpolation, which, for simplicity, we did not use in our codes.

4.1 Test 1: Semilinear Equation

As in [1], we consider (57) with \(\gamma =0\), \(\mu =1\), \(a=1\) and \(b(x)=z_0(x)=2x(1-x)\). It is easy to check that the uncontrolled solution converges, as \(t\rightarrow \infty \), to a non-null steady state (see also [1, Fig. 6.1]), and that the null solution is unstable.

For the finite-difference approximation, we consider \(y:[0,t_e]\rightarrow {{\mathbb {R}}}^{N-1}\) with components \(y_j(t)\approx z(x_j,t)\), \(x_j=j\Delta x\), \(j=1,\ldots ,N-1\), \(\Delta x=1/N\), solution of

$$\begin{aligned} Cy_t=\frac{1}{10}Ay + C(F(y) + uB) \end{aligned}$$
(61)

where the components of F and B are, respectively, \(F_j = y_j(1-y_j^2)\) and \(B_j=2 x_j(1-x_j)\), \(j=1,\ldots ,N-1\), and A and C are the \((N-1)\times (N-1)\) tridiagonal matrices given by

$$\begin{aligned} A=\frac{1}{(\Delta x)^2}\left[ \begin{array} {cccccc} -2 & \ 1\\ \ 1& -2& \ 1\\ & \ddots & \ddots & \ddots \\ & & \ 1& -2 & \ 1 \\ & & & \ 1 & -2\end{array}\right] , \qquad C=\frac{1}{12}\left[ \begin{array} {cccccc} 10 & 1\\ 1& 10& 1\\ & \ddots & \ddots & \ddots \\ & & 1& 10 & 1 \\ & & & 1 & 10\end{array}\right] , \end{aligned}$$
(62)

so that the finite-difference discretization (61) is fourth-order convergent. The norm we consider in \({{\mathbb {R}}}^{N-1}\) is given by

$$\begin{aligned} \left\| y\right\| ^2 = \Delta x \sum _{j=1}^{N-1} y_j^2. \end{aligned}$$
(63)

Let us observe that this norm is an approximation to the integral \(\int _0^1y(x)^2\,dx\) of a function taking the values \(y_j\) at the spatial mesh nodes.

To compute the snapshots, as in [1], for constant controls \(u\in U_{\textrm{snap}}=\{-1,0,1\}\), we obtained the solutions \(y^{(n)} =y(t_n)\) of (61) every 1/20 time units, that is, for \(t_n=n/20\), \(n=1,\ldots ,20t_e\), and then the time derivatives \(y_t^{(n)}\) were computed from identity (61). For the reduced spaces, we consider POD bases with only \(r=2,3\) and 4 elements, also as in [1]. The POD approximation \(y^r\) was then the mean of the snapshots plus a linear combination of the POD basis. The control set \({{\mathbb {U}}}_{\textrm{ad}}\) is given by 41 controls equally distributed in \([-1,1]\).
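The mechanics of extracting the POD basis from a matrix of time-derivative snapshots can be sketched with the SVD. The snapshot matrix below is random and merely illustrative, and the Euclidean inner product is used instead of the one induced by the norm (63); by the Eckart-Young theorem the squared projection error equals the tail of the squared singular values, the quantity appearing in Remark 2.

```python
import numpy as np

# Sketch of extracting a POD basis of r elements from a snapshot matrix W
# (whose columns would be the time derivatives y_t^{(n)}) via the SVD.
# W is random, rank-8 data used purely for illustration.
rng = np.random.default_rng(2)
W = rng.standard_normal((99, 8)) @ rng.standard_normal((8, 180))
U, S, _ = np.linalg.svd(W, full_matrices=False)
r = 4
Vr = U[:, :r]                    # POD basis: first r left singular vectors
Pr = Vr @ Vr.T                   # orthogonal projection P^r onto their span
err = np.linalg.norm(W - Pr @ W, "fro") ** 2
tail = np.sum(S[r:] ** 2)        # "tail of the eigenvalues" of Remark 2
```

Here err and tail coincide up to rounding, which is the identity underlying the projection-error terms in the bounds of this section.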

As in [1], to define the domain \(\overline{\Omega }^r\), we compute the projections of all the snapshots. With this procedure we obtain a set of points in \({\mathbb {R}}^r\). Then, we define a hypercube containing this set of points. The aim of this procedure, in view of Lemma 3, is that the set \(\overline{\Omega }^r\) defined in this way satisfies the invariance condition

$$\begin{aligned} y^r+hf^r(y^r,u)\in {{\overline{\Omega }}}^r, \quad y^r\in {{\overline{\Omega }}}^r, \quad u\in {{\mathbb {U}}}_{\textrm{ad}}. \end{aligned}$$
(64)

The set \({{\overline{\Omega }}}^r\) for \(r=4\) was given by

$$\begin{aligned}{{\overline{\Omega }}}^r =[-0.87,0.41]\times [-0.01,0.02]\times [-0.01,0.01]\times [-0.01,0.01].\end{aligned}$$

For this set we checked that condition (64) holds.
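A check of this kind can be done by direct sampling. The sketch below uses a hypothetical contractive reduced dynamics \(f^r\) (not the one of the experiment) together with the box above, and verifies that every Euler step from sampled points of the box, for every admissible control, lands back inside the box.

```python
import numpy as np

# Direct sampling check of the invariance condition (64) for a hypothetical
# box Omega^r and illustrative contractive reduced dynamics f^r: one Euler
# step from any sampled point of the box, for any admissible control, must
# land back inside the box.
rng = np.random.default_rng(3)
lo = np.array([-0.87, -0.01, -0.01, -0.01])
hi = np.array([0.41, 0.02, 0.01, 0.01])
h = 1e-3
U = np.linspace(-1.0, 1.0, 41)
fr = lambda y, u: -y + 0.005 * u * np.ones_like(y)   # illustrative f^r

ok = True
for _ in range(1000):
    y = lo + (hi - lo) * rng.random(4)               # random point of the box
    for u in U:
        step = y + h * fr(y, u)
        ok = ok and bool(np.all(step >= lo) and np.all(step <= hi))
```

In practice one would sample the boundary faces of the hypercube as well, since for a dynamics pointing inward the interior is the easy case.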

We notice that our set \({{\overline{\Omega }}}^r\) is considerably smaller than the corresponding set in [1] (see [1, 6.1. Test 1]), where the authors use the standard Euclidean norm in \({{\mathbb {R}}}^{n}\) rather than the norm (63) we use here. Since the domain is smaller we also consider partitions of \(\Omega ^r\) smaller than those in [1]. We take maximum diameter \(k_r=0.01\), and, as in [1], we choose \(h=0.1k_r\).

Fig. 1
figure 1

Test 1: Optimal HJB states computed with \(r=4\) POD basis functions (top left), difference between the optimal solutions with 4 and 2 POD basis functions (top middle), difference between the optimal solutions with 4 and 3 POD basis functions (top right). Optimal HJB controls with \(r=4,3,2\) (bottom). The red crosses correspond to the values of the controls, which we have joined by a blue line

In Fig. 1 we have represented, on top, the optimal solution (left) for \(r=4\), the difference between the optimal solutions with 4 and 2 POD basis functions (top middle) and the difference between the optimal solutions with 4 and 3 POD basis functions (top right). At the bottom we have represented the optimal controls for \(r=2\), 3 and 4.

We observe that we get much better results than those in [1], although, apart from using a different set of snapshots, both the finite-difference method and the time integrator that we use are more accurate than those in [1]. We also notice that there is little discrepancy between the values of the optimal HJB states for the different values of r that we tried. We also computed the values of the cost functional (58) on the optimal HJB states for the three values of r. The values are shown in Fig. 2. It can be seen that the values decrease with r and that they differ in the ninth significant digit.

Fig. 2
figure 2

Test 1: Value of the cost functional (58) on the optimal HJB states for \(r=2,3,4\). The red crosses correspond to the cost values, which are joined by a blue line

Fig. 3
figure 3

Test 1: Relative error between the optimal HJB states with \(r=4\) corresponding to solving (36) by fixed point iteration with tolerances \(\textrm{TOL}_v=5\times 10^{-4}\) and \(\textrm{TOL}_v=1\times 10^{-4}\) (left), \(\textrm{TOL}_v=1\times 10^{-4}\) and \(\textrm{TOL}_v=2\times 10^{-5}\) (centre), and optimal HJB controls (right)

As mentioned above, there can be a significant difference between the optimal HJB states computed with solutions obtained by solving (36) with different tolerances \(\textrm{TOL}_v\). In Fig. 3 we show the relative error between the optimal HJB states corresponding to tolerances \(\textrm{TOL}_v=5\times 10^{-4}\) and \(\textrm{TOL}_v=1\times 10^{-4}\) (left), and between the latter and that corresponding to \(\textrm{TOL}_v=2\times 10^{-5}\) (centre). The right plot shows the corresponding optimal HJB controls. Figure 3 shows the importance of solving (36) accurately in order to obtain good optimal HJB states, thus justifying that we computed the (approximations to the) solution of (36) with decreasing values of \(\textrm{TOL}_v\) until the relative error of the corresponding optimal HJB states was below 10%.

Following the suggestion of one of the reviewers, we checked whether the optimal control computed with \(\Delta x=1/100\) stabilizes the same PDE discretized on finer meshes. We tried this for \(r=4\) and \(\Delta x=1/400\) and 1/800, and the controlled solutions converged to zero as fast as in the case \(\Delta x=1/100\). Given the accuracy with which we had computed the controls for \(\Delta x=1/100\) and the accuracy of the discretization itself, we did not expect otherwise.

The results above suggest that, for this problem, \(r=3\) is enough. For this value of r we now check the effect of the finite-difference mesh on the optimal HJB states. In Table 1 we show the relative errors of the optimal HJB states computed with \(N=25\) and \(N=50\) with respect to that computed with \(N=100\), as well as the relative errors of the corresponding controls. For the controls, we show in Table 1 the maximum over all values of \(t\in \{0,0.05,0.1,\ldots ,3\}\) of the relative errors, and for the states we show the maximum over the same values of t of the maximum of the relative errors of the state over all the points of the corresponding spatial grid. They confirm that the finite-difference discretization is of order 4. Due to the excellent accuracy obtained with \(\Delta x=1/50\), the results that follow are computed with that value of \(\Delta x\).

Table 1 Relative errors of the optimal HJB states and controls for \(r=3\) computed with \(\Delta x=1/N\), \(N=25\) and \(N=50\), with respect to those computed with \(N=100\)
Fig. 4
figure 4

Test 1: Results for different values of \(k_r\); relative errors between HJB states (top left) and controls (bottom left) with respect to \(k_r=0.005\); HJB controls (centre) and values of the cost functional (58) (right)

We now check the effect of different values \(k_r\) of the diameter of the partition of \(\Omega ^r\). To do this, we compare the results obtained with \(r=3\), \(\Delta x=1/50\) and \(k_r=0.02, 0.01, 0.005\). In order not to spoil the better accuracy obtained with the smaller value of \(k_r\), we took \({{\mathbb {U}}}_{\textrm{ad}}\) with 161 controls equally distributed in \([-1,1]\) for \(k_r=0.005\), and, to simplify computations, with only 11 controls for \(k_r=0.02\) (we also tried 21 and 41 controls but, although we do not have at present an explanation for it, using only 11 controls with \(k_r=0.02\) gave somewhat better results). The results can be seen in Fig. 4. The plots on the left show the (maximum over the 51 points of the spatial grid of the) relative errors of the optimal HJB states (top) and their controls (bottom) for \(k_r=0.02\) and \(k_r=0.01\) with respect to those of \(k_r=0.005\), while the plot in the centre shows the HJB controls. The errors, as expected, are smaller for \(k_r=0.01\) than for \(k_r=0.02\), except for the controls for \(t\in [1.8,2.55]\), where they are slightly larger. We also notice that, for \(k_r=0.01\), the relative errors remain below 10% except for \(t\in [1.5,2.2]\) (where they remain below 18%) in the case of the errors in the optimal HJB states and \(t\in [1.8,3]\) for the controls. Notice, however, that the largest errors take place where both the states and the controls are close to zero (recall the plots in Fig. 1), where it is difficult to obtain small relative errors. Maybe this is the reason for the similar values of the cost functional (58) for the three values of \(k_r\), which are shown in the right plot of Fig. 4; the relative errors (with respect to \(k_r=0.005\)) for \(k_r=0.02\) and \(k_r=0.01\) are 0.081% and 0.0023%, respectively.

One may wonder what the result would be if the snapshots were used to obtain the POD basis, as in [1], instead of the time derivatives, as in the present paper. Thus, we repeated our computations replacing the time derivatives by the snapshots minus their mean. We did not find any significant difference. In Fig. 5 we show the results corresponding to \(\Delta x =1/50\), \(r=3\), the set \(\Omega ^r\) being

$$\begin{aligned} \Omega ^r = (-0.42,0.9)\times (-0.01, 0.02)\times (-0.01,0.01). \end{aligned}$$

The optimal HJB control is shown on the right plot, while the other two plots show the relative errors of the HJB state (left) and its control (centre) with respect to the results when the POD basis is taken from the time derivatives, \(r=3\) and \(\Delta x=1/100\). We see that the relative errors are below 0.1%, and thus, no difference can be seen between the right plot in Fig. 5 and the centre plot in Fig. 1. We also notice that our results when the POD basis is taken from the snapshots are better than those in [1]. We believe that this is due to the higher accuracy of our computations (fourth-order convergent finite-difference method instead of a second-order convergent one, NDF with small tolerances to compute the snapshots instead of the implicit Euler method, denser sets \({{\mathbb {U}}}_{\textrm{ad}}\) for the control variable, smaller tolerance \(\textrm{TOL}_v\) in the fixed point method to solve (36), etc).

Fig. 5
figure 5

Test 1: Results for POD basis extracted from snapshots (\(\Delta x =1/50\), \(r=3\)); relative errors of HJB state (left) and control (centre) with respect to the case where the POD basis is taken from time derivatives; HJB control (right)

The fact that very similar results are obtained when the POD basis is taken from the snapshots or the time derivatives should not be surprising. As shown in [18], whether better results are obtained if the POD basis is extracted from the snapshots or from their difference quotients is case-dependent and, as shown in [16], very similar results are usually obtained when the POD basis is taken from the time derivatives or the snapshot difference quotients. The advantage of using time derivatives or difference quotients for the POD basis is more on the theoretical side, since it makes it possible to prove optimal convergence of the POD methods under fewer assumptions than when the POD basis is extracted from the snapshots. More recently, it has been shown in [17] that, using only snapshots for the POD basis, one can prove error estimates for the corresponding POD methods with convergence rates as close to optimal as the smoothness of the solution from which the snapshots are taken allows. In view of the recent results in [17], the analysis in the present paper can be easily adapted to cover also the case where the POD basis is taken from the snapshots.

4.2 Test 2: Advection–Diffusion Equation

As in [1], we now consider (57) with \(\gamma =1\) and \(\mu =0\), \(I=(0,2)\) and \(z_0(x)=\max (0,0.5\sin (\pi x))\). We take b as the characteristic function of the interval (1/2, 1). To compute the POD basis we compute the time derivatives of the states for constant controls \(u=-2.2,-1.1,0\). The semidiscretization was done with a standard finite difference method

$$\begin{aligned} \frac{dy_j}{dt} =\frac{y_{j+1}-2y_j + y_{j-1}}{10(\Delta x)^2} - \frac{y_{j+1}-y_{j-1}}{2\Delta x} + b(x_j), \quad j=1,\ldots ,N-1, \qquad y_0=y_N=0, \end{aligned}$$

which is second-order convergent for problems with sufficiently smooth solutions.
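The semidiscrete system above can be sketched directly in code. This is a hedged illustration of the displayed scheme only (centered second differences for the diffusion term with coefficient 1/10, centered differences for the advection term, homogeneous Dirichlet values \(y_0=y_N=0\)); the control contribution is written simply as \(b(x_j)\), exactly as in the display, and the time integrator is not included.

```python
import numpy as np

def rhs(y, dx, b):
    """Right-hand side of the semidiscrete advection-diffusion system:
    dy_j/dt = (y_{j+1} - 2 y_j + y_{j-1}) / (10 dx^2)
              - (y_{j+1} - y_{j-1}) / (2 dx) + b(x_j),
    with homogeneous Dirichlet boundary values y_0 = y_N = 0."""
    ye = np.concatenate(([0.0], y, [0.0]))            # pad with boundary zeros
    diff = (ye[2:] - 2.0 * ye[1:-1] + ye[:-2]) / (10.0 * dx**2)
    adv = (ye[2:] - ye[:-2]) / (2.0 * dx)
    return diff - adv + b

dx = 1.0 / 50                                          # grid on I = (0, 2)
N = 100                                                # so x_N = N dx = 2
x = dx * np.arange(1, N)                               # interior nodes
b = ((x > 0.5) & (x < 1.0)).astype(float)              # characteristic fn of (1/2, 1)
y0 = np.maximum(0.0, 0.5 * np.sin(np.pi * x))          # initial state z_0
dydt = rhs(y0, dx, b)
```

Any ODE integrator (here, for instance, the implicit Euler method mentioned below) can then be applied to `rhs`.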

Since the initial state \(z_0\) does not possess second-order derivatives in \(L^2\), the time derivative \(z_t\) blows up as \(t\rightarrow 0\). For this reason, after the spatial discretization by finite differences, we replaced the time derivative at \(t=0\) by the difference quotient \((y^{(1)}-y^{(0)})/\Delta t\) of the states at \(t=0\) and \(t=\Delta t\). Perhaps also because of the lack of bounded time derivatives at \(t=0\) and the more dissipative nature of the implicit Euler method, we found that, in the computation of the optimal HJB states and controls, better results were obtained if the implicit Euler method with \(\Delta t=1/20\) was used instead of the NDF with small tolerances. Also, for reasons that we do not understand at present, we found that better results were obtained when the POD approximation was a linear combination of the POD basis plus the initial condition \(y_0\), instead of a linear combination of the POD basis plus the mean as in the previous section.

In the previous test we had an invariance set, so that we did not need to impose any boundary condition for solving (36). In this test we found it impossible to find a set \(\Omega ^r\) satisfying condition (64), whether the POD basis is taken from the states or from their time derivatives. In the latter case, the set \(\Omega ^r\) for \(r=4\) used in the experiments was

$$\begin{aligned}\Omega ^r=(-0.5,0.7)\times (-0.3,1.5)\times (-0.3,0.2)\times (-0.05,0.15).\end{aligned}$$

To overcome the lack of invariance of this set, whenever for a vertex \(y^i_r\) we had \(y^i_r+hf^r(y^i_r,u)\not \in \overline{\Omega }^r\), we simply replaced \(y^i_r+hf^r(y^i_r,u)\) by its closest point on \(\partial \Omega ^r\). This resulted in changing the value of \(y^i_r+hf^r(y^i_r,u)\) by less than 2% in the first two coordinates in the POD basis and by \(15\%\) in the remaining ones, except for some negative values of the fourth coordinate, where errors of up to \(60\%\) were encountered. For example, an error of \(15\%\) in the third coordinate means that for some values of \(y^i_r+hf^r(y^i_r,u)\) the third coordinate could lie in \([-0.345,0.23]\) instead of \([-0.3,0.2]\). Nevertheless, as we will see below, the results obtained with the POD approximation in this test were excellent.
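Since \(\Omega ^r\) is a box, the closest point on its boundary to an exterior point is obtained by a componentwise clamp, so the projection step above is one line. A minimal sketch, using the box for \(r=4\) displayed above (the sample point is hypothetical):

```python
import numpy as np

# The box Omega^r for r = 4 used in the experiments above
lo = np.array([-0.5, -0.3, -0.3, -0.05])
hi = np.array([ 0.7,  1.5,  0.2,  0.15])

def project_to_box(y):
    """Closest point of the closed box [lo, hi] to y; for a point outside
    the box this is the closest point on its boundary, which replaces the
    Euler step y + h f^r(y, u) whenever that step leaves Omega^r."""
    return np.clip(y, lo, hi)

y_step = np.array([0.8, 0.0, -0.345, 0.23])   # leaves the box in coords 1, 3, 4
y_proj = project_to_box(y_step)               # clamped to [0.7, 0.0, -0.3, 0.15]
```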

Since this problem is linear-quadratic, the solution of the HJB equation can be computed by solving the Riccati equation. In Fig. 6 we show the uncontrolled solution (left), the optimal LQR state (middle) and the optimal LQR control (right).
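A hedged sketch of the LQR reference computation follows, on a tiny hypothetical system rather than the discretized PDE above. The discount factor \(e^{-\lambda t}\) in the cost is removed by the standard change of variables \({\tilde{y}}=e^{-\lambda t/2}y\), \({\tilde{u}}=e^{-\lambda t/2}u\), which shifts the drift matrix to \(A-(\lambda /2)I\); the algebraic Riccati equation is then solved with SciPy.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

def lqr_gain(A, B, Q, R, lam=0.0):
    """Feedback gain K (u = -K y) for the discounted LQR problem
    min  int e^{-lam t} (y' Q y + u' R u) dt   s.t.  y' = A y + B u,
    via the shift A -> A - (lam/2) I and the continuous-time ARE."""
    As = A - 0.5 * lam * np.eye(A.shape[0])
    P = solve_continuous_are(As, B, Q, R)
    return np.linalg.solve(R, B.T @ P)          # K = R^{-1} B' P

# Hypothetical 2x2 system, stand-in for the semidiscrete advection-diffusion model
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[0.01]])                           # control weight 1/100, as in (58)
K = lqr_gain(A, B, Q, R, lam=1.0)
```

The gain stabilizes the shifted system, which is the discounted-cost notion of optimality used for the comparison in this section.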

Fig. 6
figure 6

Test 2: Uncontrolled solution (left), the optimal LQR state (middle) and the optimal LQR control (right)

Fig. 7
figure 7

Test 2: Optimal HJB states computed with \(r=4\) POD basis functions (top-left), difference between optimal solution with 4 and 2 POD basis functions (top middle), difference between optimal solution with 4 and 3 POD basis functions (top-right). Optimal HJB controls with \(r=4,3,2\) (bottom)

In Fig. 7 we have represented on top the optimal solution (left) for \(r=4\), the difference between optimal solution with 4 and 2 POD basis functions (top middle) and the difference between optimal solution with 4 and 3 POD basis functions (top right). On the bottom part we have represented the optimal controls for \(r=2\), 3 and 4. Also in this case, the improvement with respect to the results in [1] is remarkable. In particular, the optimal controls in Fig. 7 compare very well with the optimal LQR control of Fig. 6 even for the case with only \(r=2\) basis functions in our POD method.

To conclude, in Fig. 8 (left) we show the difference between the optimal LQR state and the optimal HJB state computed with \(r=4\) POD basis functions. On the right, we show the relative errors \(\left| u_{\textrm{HJB}} - u_{LQR}\right| /\max (10^{-3}, \left| u_{LQR}\right| )\) of the optimal HJB controls with respect to the optimal LQR control for \(r=2, 3\) and 4. A very good agreement between the HJB and LQR optimal states can be seen. With respect to the optimal controls, we notice that, whereas with \(r=3\) and \(r=4\) POD basis functions the errors do not exceed 30% and, indeed, stay below 10% most of the time, this is not the case for \(r=2\) POD basis functions, where errors are above 100% for more than half the time interval. However, let us observe that we are considering relative errors on the right of Fig. 8 and that, restricting ourselves to the time interval in which the optimal control is sufficiently far from zero, the errors for \(r=2\) are also below 35%.
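The floored relative error used in Fig. 8 is worth making explicit, since the floor \(10^{-3}\) is what keeps the measure meaningful where the reference control crosses zero. A minimal sketch (the sample values are hypothetical):

```python
import numpy as np

def rel_err(u_hjb, u_lqr, floor=1e-3):
    """Pointwise relative error |u_HJB - u_LQR| / max(floor, |u_LQR|);
    the floor on the denominator avoids blow-up where u_LQR is near zero."""
    return np.abs(u_hjb - u_lqr) / np.maximum(floor, np.abs(u_lqr))

u_lqr = np.array([0.5, 1e-4, -0.2])
u_hjb = np.array([0.45, 2e-4, -0.2])
errs = rel_err(u_hjb, u_lqr)      # [0.1, 0.1, 0.0]
```

Note that at the second point the raw relative error would be 100%, while the floored measure reports 10%.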

Fig. 8
figure 8

Test 2: relative errors \(\left| u_{\textrm{HJB}} - u_{LQR}\right| /\max (10^{-3}, \left| u_{LQR}\right| )\) of the optimal HJB controls with respect to the optimal LQR control (left) and difference between the optimal LQR state and the optimal HJB state computed with \(r=4\) POD basis functions (right)

Again, the results here (which correspond to \(k_r=0.1\)) compare favourably with those in the literature. As in the previous test, we checked for \(r=4\) whether the optimal control computed with \(\Delta x=1/100\) stabilizes the same PDE discretized with \(\Delta x=1/400\) and 1/800, with similar results as in the previous section.

Fig. 9
figure 9

Test 3: Uncontrolled solution at \(t=0,1.5,3\)

4.3 Test 3: A Two-Dimensional Reaction–Diffusion Equation

We extend (57) to two dimensions. In particular, we consider,

$$\begin{aligned} \begin{array}{rcll} z_t-\varepsilon \Delta z + (z^3-z) & =& ub\quad & \hbox {in }\Omega \times (0,t_e),\\ z(\cdot ,0)& =& z_0\quad & \hbox {in }\Omega ,\\ z(\cdot ,t) & =& 0\quad & \hbox {on }\partial \Omega \times (0,t_e), \end{array} \end{aligned}$$
(65)

with \(\varepsilon =1/10\), \(\Omega =[0,1]\times [0,1]\), where \(z: \Omega \times [0,t_e]\rightarrow {{\mathbb {R}}}\) denotes the state. The control u belongs to \({{\mathbb {U}}}_{\textrm{ad}} = L^2(0,t_e,[u_a,u_b])\), with \(u_a=-1\) and \(u_b=1\). The cost function is as in (58) but with the state measured in \(L^2(\Omega )\) instead of \(L^2(I)\), that is

$$\begin{aligned} \int _{0}^{t_e} e^{-\lambda t}\left( \left\| z(\cdot ,t,u)\right\| _{L^2(\Omega )}^2 + \frac{1}{100}\left| u(t)\right| ^2\right) \,dt, \end{aligned}$$

with \(\lambda =1\) as before. Similarly to Sect. 4.1, we take \(b(x,y)=z_0(x,y)=4x(1-x)y(1-y)\) and \(t_e=3\). In Fig. 9 we show the uncontrolled solution at the initial time, at \(t=t_e/2\) and \(t=t_e\).

For the finite-difference approximation, we consider \(y:[0,t_e]\rightarrow {{\mathbb {R}}}^{(N-1)^2}\) with components \(y_k(t)\approx z({{\varvec{x}}}_k,t)\), where, for \(k=(j-1)(N-1) +i\), \(i,j=1,\ldots ,N-1\), \({{\varvec{x}}}_k= (x_i,y_j)\), and \(x_i=i\Delta x\), \(y_j=j\Delta y\), \(\Delta x=\Delta y=1/N\), solution of

$$\begin{aligned} {\hat{C}}y_t=\frac{1}{10}{\hat{A}}y + {\hat{C}}({\hat{F}}(y) + u{\hat{B}}) \end{aligned}$$
(66)

where the components of \({\hat{F}}\) and \({\hat{B}}\) are, respectively \({\hat{F}}_k = y_k(1-y_k^2)\), \({\hat{B}}_k=4 x_i(1-x_i) y_j(1-y_j)\), \(k=(j-1)(N-1) + i\), \(i,j=1,\ldots ,N-1\), and \({\hat{A}}\) and \({\hat{C}}\) are \((N-1)^2\times (N-1)^2\) matrices given by \({\hat{A}}=I\otimes A + A\otimes I\), \({\hat{C}}= I\otimes C + C\otimes I\), where I is the identity of order \(N-1\), \(\otimes \) represents the Kronecker product of matrices, and A and C are the matrices in (62) so that, as in Sect. 4.1, the finite-difference discretization (66) is fourth-order convergent. The norm we consider in \({{\mathbb {R}}}^{(N-1)^2}\) is given by

$$\begin{aligned} \left\| y\right\| ^2 = \Delta x\Delta y \sum _{k=1}^{(N-1)^2} y_k^2, \end{aligned}$$
(67)

so that it is a discrete version of the \(L^2\) norm in \(\Omega \).
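The Kronecker-product construction of the two-dimensional operators and the discrete \(L^2\) norm (67) can be sketched as follows. This is an illustration only: a standard second-order difference matrix stands in for the fourth-order matrices A and C of (62), which are not reproduced here.

```python
import numpy as np

def diff2_1d(N):
    """Standard 1-D second-difference matrix of size (N-1) x (N-1),
    scaled by 1/dx^2 with dx = 1/N; a stand-in for the matrices of (62)."""
    M = (np.diag(-2.0 * np.ones(N - 1))
         + np.diag(np.ones(N - 2), 1)
         + np.diag(np.ones(N - 2), -1))
    return M * N**2

N = 10
A = diff2_1d(N)
I = np.eye(N - 1)
A_hat = np.kron(I, A) + np.kron(A, I)   # 2-D operator A_hat = I (x) A + A (x) I, as in (66)

def discrete_l2(y, dx, dy):
    """Discrete version (67) of the L^2(Omega) norm."""
    return np.sqrt(dx * dy * np.sum(y**2))

y = np.ones((N - 1) ** 2)
nrm = discrete_l2(y, 1.0 / N, 1.0 / N)
```

The same Kronecker pattern gives \({\hat{C}}= I\otimes C + C\otimes I\), so the 1-D matrices are assembled once and reused for the 2-D system.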

In spite of the similarity with the one-dimensional case, more POD modes were needed to attain results similar to those in Sect. 4.1, and for \(r=5\) modes a set \({{\mathbb {U}}}_{\textrm{ad}}\) with 81 controls uniformly distributed in \([-1,1]\) was used for higher accuracy. The set \({{\overline{\Omega }}}^r\) for \(r=5\) was given by

$$\begin{aligned} {{\overline{\Omega }}}^r =[-0.8,0.36]\times [-0.02,0.03]\times [-0.01,0.01]\times [-0.01,0.01]\times [-0.01,0.01]. \end{aligned}$$

For this set we checked that condition (64) holds.

In Fig. 10 (top three plots) we show the optimal HJB state at three different times, computed with \(r=5\). Notice that the vertical scale is 5 times smaller than in the plots in Fig. 9. We can see that the results are very similar to the one-dimensional case.

Fig. 10
figure 10

Test 3: Optimal HJB states computed with \(r=5\) POD basis functions at \(t=1,1.5,2\) (top) and optimal HJB controls with \(r=4,3,2\) (bottom). The red crosses correspond to the values of the controls that have been joined by a blue line in the plot

For \(r=5\) we checked if the optimal control computed with \(\Delta x=\Delta y=1/50\) stabilizes the solutions corresponding to \(\Delta x=\Delta y=1/200\) and \(\Delta x=\Delta y=1/400\), which, as in the previous sections, it did.

5 Conclusions

In this paper we introduce a reduced order method based on POD to mitigate the curse of dimensionality in the numerical approximation of HJB equations. The novelty of the method is the use of snapshots based on temporal derivatives of the controlled nonlinear dynamical system.

We carry out the error analysis of the method based on the recent results obtained in [7], which allow us to get sharper error bounds than those appearing in the literature. In particular, the factor 1/h, where h is the time step of the fully discrete method, does not appear in our error bounds. Our error bounds are optimal in terms of the time step h and the mesh diameter \(k_r\) of the reduced space and, as usual, depend also on the size of the tail of eigenvalues in the singular value decomposition. The use of snapshots based on time derivatives allows us to bound some of the terms in the error that could not be bounded with the standard approach.

Numerous numerical experiments are performed. We check that the method behaves in practice as expected from the theoretical error analysis carried out in the present paper. We show the importance of choosing a small tolerance for the fixed point iteration solving the POD fully discrete scheme (36). We include a two-dimensional example to check the good performance of the method also in that case. The new method we propose obtains better results than a similar POD method presented in [1]. Moreover, even for the method shown in [1], we have performed a numerical experiment in which we also get better results. This allows us to confirm the good performance not only of the new method introduced in this paper, but also of the method presented in [1], when choosing accurate snapshots, dense enough sets for the control variable and, as mentioned above, a small enough tolerance in the fixed point method solving (36). Finally, even in an example in which it is not possible to find an invariance set, we propose a procedure that allows the method of this paper to be applied and that produces excellent results.