Appendix A
Proof of Theorem 1
The convergence proof is based on Lyapunov analysis. We consider the following positive definite Lyapunov candidate
$$L = V_{1} (x) + V_{2} (x) + \overbrace {{\frac{1}{2}\tilde{W}_{1}^{T} a_{1}^{ - 1} \tilde{W}_{1} }}^{{L_{1} }} + \overbrace {{\frac{1}{2}\tilde{W}_{2}^{T} a_{2}^{ - 1} \tilde{W}_{2} }}^{{L_{2} }} + \frac{1}{2}\tilde{W}_{3}^{T} a_{3}^{ - 1} \tilde{W}_{3} + \frac{1}{2}\tilde{W}_{4}^{T} a_{4}^{ - 1} \tilde{W}_{4}$$
(A.1)
where \(V_{1} (x)\), \(V_{2} (x)\) are approximate solutions to the constrained coupled HJ Eq. 10. The derivative of the Lyapunov function is given by
$$\dot{L}(x) = \dot{V}_{1} (x) + \dot{V}_{2} (x) + \overbrace {{\tilde{W}_{1}^{T} a_{1}^{ - 1} \dot{\tilde{W}}_{1} }}^{{\dot{L}_{1} }} + \overbrace {{\tilde{W}_{2}^{T} a_{2}^{ - 1} \dot{\tilde{W}}_{2} }}^{{\dot{L}_{2} }} + \tilde{W}_{3}^{T} a_{3}^{ - 1} \dot{\tilde{W}}_{3} + \tilde{W}_{4}^{T} a_{4}^{ - 1} \dot{\tilde{W}}_{4}$$
(A.2)
The first term in (A.2) is
$$\begin{gathered} \dot{V}_{1} (x) = \nabla V_{1} \dot{x} = \left( {W_{1}^{T} \nabla \sigma_{1} + \nabla \varepsilon_{1}^{T} } \right)\,\left( {f + g_{1} \hat{u}_{1} + g_{2} \hat{u}_{2} } \right) \hfill \\ = W_{1}^{T} \nabla \sigma_{1} f - W_{1}^{T} \nabla \sigma_{1} g_{1} \bar{u}_{1} \tanh (\hat{D}_{1} ) - W_{1}^{T} \nabla \sigma_{1} g_{2} \bar{u}_{2} \tanh (\hat{D}_{2} ) + \varepsilon^{\prime}_{1} (x) \hfill \\ \end{gathered}$$
(A.3)
where \(\varepsilon^{\prime}_{1} (x) = \nabla \varepsilon_{1}^{T} (f - g_{1} \bar{u}_{1} \tanh (\hat{D}_{1} ) - g_{2} \bar{u}_{2} \tanh (\hat{D}_{2} ))\).
Adding and subtracting the terms \(W_{1}^{T} \nabla \sigma_{1} g_{1} \bar{u}_{1} \tanh (D_{1} )\) and \(W_{1}^{T} \nabla \sigma_{1} g_{2} \bar{u}_{2} \tanh (D_{2} )\) in (A.3) gives
$$\begin{gathered} \dot{V}_{1} (x) = W_{1}^{T} \varsigma_{1} (t) \hfill \\ + W_{1}^{T} \nabla \sigma_{1} g_{1} \bar{u}_{1} \left( {\tanh (D_{1} ) - \tanh (\hat{D}_{1} )} \right)\, + W_{1}^{T} \nabla \sigma_{1} g_{2} \bar{u}_{2} \left( {\tanh (D_{2} ) - \tanh (\hat{D}_{2} )} \right) + \varepsilon^{\prime}_{1} (x) \hfill \\ \end{gathered}$$
(A.4)
From the HJ Eq. 14 we have
$$W_{1}^{T} \varsigma_{1} (t) = - Q_{1} (x(t)) - M_{1} (u_{1} (t)) - M_{1} (u_{2} (t)) + \varepsilon_{CHJ1}$$
(A.5)
The terms \(M_{1} (u_{1} (t))\) and \(M_{1} (u_{2} (t))\) are obtained as follows by substituting (9) into (3)
$$M_{1} (u_{1} (t)) = \bar{u}_{1} W_{1}^{T} \nabla \sigma_{1} g_{1} \tanh (D_{1} ) + \bar{u}_{1}^{2} \bar{R}_{11} \ln (\underline{{\mathbf{1}}} - \tanh^{2} (D_{1} ))$$
(A.6)
$$M_{1} (u_{2} (t)) = \bar{u}_{2} W_{2}^{T} \nabla \sigma_{2} g_{2} R_{22}^{ - T} R_{12} \tanh (D_{2} ) + \bar{u}_{2}^{2} \bar{R}_{12} \ln (\underline{{\mathbf{1}}} - \tanh^{2} (D_{2} ))$$
(A.7)
where \(\underline{{\mathbf{1}}}\) is a column vector with all elements equal to one, and \(\bar{R}_{11} \in \Re^{{1 \times m_{1} }}\) and \(\bar{R}_{12} \in \Re^{{1 \times m_{2} }}\) are row vectors whose elements are the main-diagonal elements of \(R_{11}\) and \(R_{12}\), respectively. Substituting (A.5) into (A.4) gives
$$\begin{gathered} \dot{V}_{1} (x) = - Q_{1} (x(t)) - M_{1} (u_{1} (t)) - M_{1} (u_{2} (t))\, + W_{1}^{T} \nabla \sigma_{1} g_{1} \bar{u}_{1} \left( {\tanh (D_{1} ) - \tanh (\hat{D}_{1} )} \right) \hfill \\ + W_{1}^{T} \nabla \sigma_{1} g_{2} \bar{u}_{2} \left( {\tanh (D_{2} ) - \tanh (\hat{D}_{2} )} \right) + \varepsilon^{\prime}_{1} (x) + \varepsilon_{CHJ1} \hfill \\ \end{gathered}$$
(A.8)
According to Assumptions 1 and 2, one can easily show that
$$\left\| {\varepsilon^{\prime}_{1} (x)} \right\| \le b_{\varepsilon 1x} b_{f} \left\| x \right\| + b_{\varepsilon 1x} \left( {\bar{u}_{1} b_{g1} + \bar{u}_{2} b_{g2} } \right).$$
(A.9)
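As a hedged sketch of why (A.9) holds, suppose Assumptions 1 and 2 (not restated in this appendix) supply the bounds \(\left\| f \right\| \le b_{f} \left\| x \right\|\), \(\left\| {g_{i} } \right\| \le b_{gi}\) and \(\left\| {\nabla \varepsilon_{1} } \right\| \le b_{\varepsilon 1x}\), and that the saturated terms satisfy \(\left\| {\tanh (\hat{D}_{i} )} \right\| \le 1\) (exact for scalar inputs; for \(m_{i} > 1\) a factor \(\sqrt {m_{i} }\) can be absorbed into \(b_{gi}\)). Applying the triangle inequality to the definition of \(\varepsilon^{\prime}_{1} (x)\) given below (A.3) then yields
$$\begin{gathered} \left\| {\varepsilon^{\prime}_{1} (x)} \right\| \le \left\| {\nabla \varepsilon_{1} } \right\|\left( {\left\| f \right\| + \bar{u}_{1} \left\| {g_{1} } \right\|\left\| {\tanh (\hat{D}_{1} )} \right\| + \bar{u}_{2} \left\| {g_{2} } \right\|\left\| {\tanh (\hat{D}_{2} )} \right\|} \right) \hfill \\ \le b_{\varepsilon 1x} \left( {b_{f} \left\| x \right\| + \bar{u}_{1} b_{g1} + \bar{u}_{2} b_{g2} } \right) \hfill \\ \end{gathered}$$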
Next, using (A.9) and the fact that \(M_{1} (u_{1} (t))\) and \(M_{1} (u_{2} (t))\) are positive definite, and noting that since \(Q_{1} (x) \ge 0\), there exists a \(q_{1}\) such that \(x^{T} q_{1} x < Q_{1} (x)\) for \(x \in \varOmega\), (A.8) becomes
$$\dot{V}_{1} < - x^{T} q_{1} x + b_{\varepsilon 1x} b_{f} \left\| x \right\| + 2\bar{u}_{1} b_{g1} b_{\sigma 1x} W_{1} + 2\bar{u}_{2} b_{g2} b_{\sigma 1x} W_{1} + b_{\varepsilon 1x} \left( {\bar{u}_{1} b_{g1} + \bar{u}_{2} b_{g2} } \right) + \varepsilon_{h1}$$
(A.10)
where \(\varepsilon_{h1}\) is the bound for \(\varepsilon_{CHJ1}\).
Denoting \(k^{\prime}_{1} = b_{\varepsilon 1x} b_{f}\), \(k^{\prime}_{2} = 2b_{\sigma 1x} W_{1} \left( {\bar{u}_{1} b_{g1} + \bar{u}_{2} b_{g2} } \right) + b_{\varepsilon 1x} \left( {\bar{u}_{1} b_{g1} + \bar{u}_{2} b_{g2} } \right) + \varepsilon_{h1}\), (A.10) becomes
$$\dot{V}_{1} < - x^{T} q_{1} x + k^{\prime}_{1} \left\| x \right\| + k^{\prime}_{2}$$
(A.11)
Similarly, by noting that \(x^{T} q_{2} x < Q_{2} (x)\) for \(x \in \varOmega\), for the second term in (A.2) one obtains
$$\dot{V}_{2} < - x^{T} q_{2} x + k^{\prime\prime}_{1} \left\| x \right\| + k^{\prime\prime}_{2}$$
(A.12)
where \(k^{\prime\prime}_{1} = b_{\varepsilon 2x} b_{f}\), \(k^{\prime\prime}_{2} = 2b_{\sigma 2x} W_{2} \left( {\bar{u}_{1} b_{g1} + \bar{u}_{2} b_{g2} } \right) + b_{\varepsilon 2x} \left( {\bar{u}_{1} b_{g1} + \bar{u}_{2} b_{g2} } \right) + \varepsilon_{h2}\), and \(\varepsilon_{h2}\) is the bound for \(\varepsilon_{CHJ2}\).
Using (33) and the fact that \(\dot{\tilde{W}}_{1} = - \dot{\hat{W}}_{1}\), the third term of (A.2) is obtained as
$$\dot{L}_{1} = \tilde{W}_{1}^{T} a_{1} a_{1}^{ - 1} \left( {\frac{{\hat{\varsigma }_{1} (t)}}{{\left( {\hat{\varsigma }_{1}^{T} (t)\hat{\varsigma }_{1} (t) + 1} \right)^{2} }}e_{1} (t) + \sum\limits_{k = 1}^{l} {\frac{{\hat{\varsigma }_{1k} }}{{\left( {\hat{\varsigma }_{1k}^{T} \hat{\varsigma }_{1k} + 1} \right)^{2} }}e_{1} (t_{k} )} } \right)$$
(A.13)
where \(\hat{\varsigma }_{1k} = \hat{\varsigma }_{1} (t_{k} )\). For \(e_{1} (t)\) in (A.13) we have
$$e_{1} (t) = Q_{1} (x) + M_{1} (\hat{u}_{1} (t)) + M_{1} (\hat{u}_{2} (t)) + \hat{W}_{1}^{T} \hat{\varsigma }_{1} (t)$$
(A.14)
Adding zero from (17) to (A.14) gives
$$\begin{gathered} e_{1} (t) = M_{1} (\hat{u}_{2} (t)) - M_{1} (u_{2} (t)) + M_{1} (\hat{u}_{1} (t)) - M_{1} (u_{1} (t)) - \tilde{W}_{1}^{T} \hat{\varsigma }_{1} (t) + \varepsilon_{CHJ1} \hfill \\ + W_{1}^{T} \nabla \sigma_{1} \left( {f - g_{1} \bar{u}_{1} \tanh (\hat{D}_{1} ) - g_{2} \bar{u}_{2} \tanh (\hat{D}_{2} )} \right) \hfill \\ - W_{1}^{T} \nabla \sigma_{1} \left( {f - g_{1} \bar{u}_{1} \tanh (D_{1} ) - g_{2} \bar{u}_{2} \tanh (D_{2} )} \right) \hfill \\ \end{gathered}$$
(A.15)
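The "adding zero" step can be spelled out as follows. This is a sketch that assumes, consistently with the last two lines of (A.15), that \(\hat{\varsigma }_{1} = \nabla \sigma_{1} (f - g_{1} \bar{u}_{1} \tanh (\hat{D}_{1} ) - g_{2} \bar{u}_{2} \tanh (\hat{D}_{2} ))\) and that \(\varsigma_{1}\) is the same expression with \(D_{1}\), \(D_{2}\) in place of \(\hat{D}_{1}\), \(\hat{D}_{2}\). Rearranging (A.5) into a zero and subtracting it from (A.14), then writing \(\hat{W}_{1} = W_{1} - \tilde{W}_{1}\), gives
$$\begin{gathered} 0 = Q_{1} (x) + M_{1} (u_{1} (t)) + M_{1} (u_{2} (t)) + W_{1}^{T} \varsigma_{1} (t) - \varepsilon_{CHJ1} \hfill \\ e_{1} (t) = M_{1} (\hat{u}_{1} (t)) - M_{1} (u_{1} (t)) + M_{1} (\hat{u}_{2} (t)) - M_{1} (u_{2} (t)) + \hat{W}_{1}^{T} \hat{\varsigma }_{1} (t) - W_{1}^{T} \varsigma_{1} (t) + \varepsilon_{CHJ1} \hfill \\ \hat{W}_{1}^{T} \hat{\varsigma }_{1} (t) - W_{1}^{T} \varsigma_{1} (t) = - \tilde{W}_{1}^{T} \hat{\varsigma }_{1} (t) + W_{1}^{T} \hat{\varsigma }_{1} (t) - W_{1}^{T} \varsigma_{1} (t) \hfill \\ \end{gathered}$$
The last two terms on the right of the third line are exactly the last two lines of (A.15).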
where \(M_{1} (\hat{u}_{1} (t))\) and \(M_{1} (\hat{u}_{2} (t))\) are obtained as follows by substituting (25) and (26) into (3)
$$M_{1} (\hat{u}_{1} (t)) = \bar{u}_{1} \hat{W}_{3}^{T} \nabla \sigma_{1} g_{1} \tanh (\hat{D}_{1} ) + \bar{u}_{1}^{2} \bar{R}_{11} \ln (\underline{{\mathbf{1}}} - \tanh^{2} (\hat{D}_{1} ))$$
(A.16)
$$M_{1} (\hat{u}_{2} (t)) = \bar{u}_{2} \hat{W}_{4}^{T} \nabla \sigma_{2} g_{2} R_{22}^{ - T} R_{12} \tanh (\hat{D}_{2} ) + \bar{u}_{2}^{2} \bar{R}_{12} \ln (\underline{{\mathbf{1}}} - \tanh^{2} (\hat{D}_{2} ))$$
(A.17)
Using (A.6) and (A.16), \(M_{1} (\hat{u}_{1} (t)) - M_{1} (u_{1} (t))\) is obtained as
$$\begin{gathered} M_{1} (\hat{u}_{1} (t)) - M_{1} (u_{1} (t)) = \hat{W}_{3}^{T} \nabla \sigma_{1} g_{1} \bar{u}_{1} \tanh (\hat{D}_{1} ) + \bar{u}_{1}^{2} \bar{R}_{11} \ln (\underline{{\mathbf{1}}} - \tanh^{2} (\hat{D}_{1} )) \hfill \\ - W_{1}^{T} \nabla \sigma_{1} g_{1} \bar{u}_{1} \tanh (D_{1} ) - \bar{u}_{1}^{2} \bar{R}_{11} \ln (\underline{{\mathbf{1}}} - \tanh^{2} (D_{1} )) \hfill \\ \end{gathered}$$
(A.18)
Next, the term \(\ln (\underline{{\mathbf{1}}} - \tanh^{2} (D_{1} ))\) can be closely approximated as [30]
$$\ln (\underline{{\mathbf{1}}} - \tanh^{2} (D_{1} )) \approx - 2D_{1} \text{sgn} (D_{1} ) + \varepsilon_{{D_{1} }} \approx - 2D_{1} \tanh (\delta D_{1} ) + \bar{\varepsilon }_{{D_{1} }}$$
(A.19)
where \(\delta\) is a large constant, \(\bar{\varepsilon }_{{D_{1} }}\) is a bounded approximation error, and \(D_{1} = (1/2\bar{u}_{1} )R_{11}^{ - 1} g_{1}^{T} \nabla \sigma_{1}^{T} W_{1}\).
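The boundedness of the error terms in (A.19) can be checked against an exact scalar identity. The following is a sketch based on standard properties of hyperbolic functions, not a restatement of the argument in [30]; the scalar \(d\) stands for any single element of \(D_{1}\), and \(\varepsilon_{d}\), \(\bar{\varepsilon }_{d}\) are the corresponding elements of the error vectors:
$$\begin{gathered} \ln (1 - \tanh^{2} (d)) = - 2\ln \cosh (d) = - 2\left| d \right| + \varepsilon_{d} ,\quad \varepsilon_{d} = \ln 4 - 2\ln \left( {1 + e^{ - 2\left| d \right|} } \right),\quad 0 \le \varepsilon_{d} < \ln 4 \hfill \\ - 2\left| d \right| = - 2d\tanh (\delta d) - 2\left| d \right|\left( {1 - \tanh (\delta \left| d \right|)} \right),\quad 0 \le 2\left| d \right|\left( {1 - \tanh (\delta \left| d \right|)} \right) \le 4\left| d \right|e^{ - 2\delta \left| d \right|} \le \frac{2}{e\delta } \hfill \\ \end{gathered}$$
Hence \(\bar{\varepsilon }_{d} = \varepsilon_{d} - 2\left| d \right|(1 - \tanh (\delta \left| d \right|))\) lies in \(( - 2/(e\delta ),\,\ln 4)\), and the part of the error introduced by replacing \(\text{sgn}\) with \(\tanh (\delta \, \cdot )\) shrinks as \(\delta\) grows.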
Using (A.19) and adding and subtracting \(W_{1}^{T} \nabla \sigma_{1} g_{1} \bar{u}_{1} \tanh (\delta \hat{D}_{1} )\) in (A.18) gives
$$\begin{gathered} M_{1} (\hat{u}_{1} (t)) - M_{1} (u_{1} (t)) = \hat{W}_{3}^{T} \nabla \sigma_{1} g_{1} \bar{u}_{1} \tanh (\hat{D}_{1} ) + \tilde{W}_{3}^{T} \nabla \sigma_{1} g_{1} \bar{u}_{1} \tanh (\delta \hat{D}_{1} ) \hfill \\ - W_{1}^{T} \nabla \sigma_{1} g_{1} \bar{u}_{1} \tanh (D_{1} ) - W_{1}^{T} \nabla \sigma_{1} g_{1} \bar{u}_{1} \left( {\tanh (\delta \hat{D}_{1} ) - \tanh (\delta D_{1} )} \right) \hfill \\ + \bar{u}_{1}^{2} \bar{R}_{11} \left( {\bar{\varepsilon }_{{\hat{D}_{1} }} - \bar{\varepsilon }_{{D_{1} }} } \right) \hfill \\ \end{gathered}$$
(A.20)
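The step from the logarithmic terms of (A.18) to the \(\tanh (\delta \, \cdot )\) terms of (A.20) can be made explicit. As a sketch, assume \(R_{11}\) is diagonal with diagonal entries \(r_{j}\) (so that \(\bar{R}_{11} = [r_{1} , \ldots ,r_{{m_{1} }} ]\)), that the product \(D\tanh (\delta D)\) in (A.19) is taken elementwise, and that \(\hat{D}_{1}\) is \(D_{1}\) with \(W_{1}\) replaced by \(\hat{W}_{3}\), as in (B.7). Writing \(\hat{D}_{1j}\) for the \(j\)th element of \(\hat{D}_{1}\),
$$\begin{gathered} \bar{u}_{1}^{2} \bar{R}_{11} \left( { - 2\hat{D}_{1} \tanh (\delta \hat{D}_{1} )} \right) = - 2\bar{u}_{1}^{2} \sum\limits_{j = 1}^{{m_{1} }} {r_{j} \frac{1}{{2\bar{u}_{1} }}r_{j}^{ - 1} \left( {g_{1}^{T} \nabla \sigma_{1}^{T} \hat{W}_{3} } \right)_{j} \tanh (\delta \hat{D}_{1j} )} \hfill \\ = - \hat{W}_{3}^{T} \nabla \sigma_{1} g_{1} \bar{u}_{1} \tanh (\delta \hat{D}_{1} ) \hfill \\ \end{gathered}$$
The same computation with \(W_{1}\) and \(D_{1}\) gives \(- W_{1}^{T} \nabla \sigma_{1} g_{1} \bar{u}_{1} \tanh (\delta D_{1} )\). Their difference, after adding and subtracting \(W_{1}^{T} \nabla \sigma_{1} g_{1} \bar{u}_{1} \tanh (\delta \hat{D}_{1} )\) and using \(\tilde{W}_{3} = W_{1} - \hat{W}_{3}\), produces the \(\tilde{W}_{3}^{T}\) term and the \(\tanh (\delta \hat{D}_{1} ) - \tanh (\delta D_{1} )\) term in (A.20).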
Likewise, \(M_{1} (\hat{u}_{2} (t)) - M_{1} (u_{2} (t))\) is obtained as
$$\begin{gathered} M_{1} (\hat{u}_{2} (t)) - M_{1} (u_{2} (t)) = \hat{W}_{4}^{T} \nabla \sigma_{2} g_{2} \bar{u}_{2} R_{22}^{ - T} R_{12} \tanh (\hat{D}_{2} ) + \tilde{W}_{4}^{T} \nabla \sigma_{2} g_{2} \bar{u}_{2} R_{22}^{ - T} R_{12} \tanh (\delta \hat{D}_{2} ) \hfill \\ - W_{2}^{T} \nabla \sigma_{2} g_{2} \bar{u}_{2} R_{22}^{ - T} R_{12} \left( {\tanh (\delta \hat{D}_{2} ) - \tanh (\delta D_{2} )} \right) \hfill \\ - W_{2}^{T} \nabla \sigma_{2} g_{2} \bar{u}_{2} R_{22}^{ - T} R_{12} \tanh (D_{2} ) + \bar{u}_{2}^{2} \bar{R}_{12} \left( {\bar{\varepsilon }_{{\hat{D}_{2} }} - \bar{\varepsilon }_{{D_{2} }} } \right) \hfill \\ \end{gathered}$$
(A.21)
Substituting (A.20) and (A.21) into (A.15) and simplifying gives
$$e_{1} (t) = - \tilde{W}_{1}^{T} (t)\hat{\varsigma }_{1} (t) - \tilde{W}_{3}^{T} \Pi_{1} (t) - \tilde{W}_{4}^{T} \Pi_{2} (t) + \Xi_{1} (t)$$
(A.22)
where \(\Pi_{1}\) and \(\Pi_{2}\) are defined in (31) and (32), and the bounded term \(\Xi_{1} (t)\) is
$$\begin{gathered} \Xi_{1} (t) = W_{1}^{T} \nabla \sigma_{1} g_{2} \bar{u}_{2} \left( {\tanh (D_{2} ) - \tanh (\hat{D}_{2} )} \right) + W_{1}^{T} \nabla \sigma_{1} g_{1} \bar{u}_{1} \left( {\tanh (\delta D_{1} ) - \tanh (\delta \hat{D}_{1} )} \right) \hfill \\ + W_{2}^{T} \nabla \sigma_{2} g_{2} \bar{u}_{2} R_{22}^{ - T} R_{12} \left( {\tanh (\hat{D}_{2} ) - \tanh (D_{2} )} \right) \hfill \\ + W_{2}^{T} \nabla \sigma_{2} g_{2} \bar{u}_{2} R_{22}^{ - T} R_{12} \left( {\tanh (\delta D_{2} ) - \tanh (\delta \hat{D}_{2} )} \right) \hfill \\ + \varepsilon_{h1} + \bar{u}_{1}^{2} \bar{R}_{11} \left( {\bar{\varepsilon }_{{\hat{D}_{1} }} - \bar{\varepsilon }_{{D_{1} }} } \right) + \bar{u}_{2}^{2} \bar{R}_{12} \left( {\bar{\varepsilon }_{{\hat{D}_{2} }} - \bar{\varepsilon }_{{D_{2} }} } \right) \hfill \\ \end{gathered}$$
(A.23)
Similarly, \(e_{1} (t_{k} )\) in (A.13) is obtained as
$$e_{1} (t_{k} ) = - \tilde{W}_{1}^{T} \hat{\varsigma }_{1k} - \tilde{W}_{3}^{T} \Pi_{1} (t_{k} ) - \tilde{W}_{4}^{T} \Pi_{2} (t_{k} ) + \Xi_{1} (t_{k} ).$$
(A.24)
Substituting (A.22) and (A.24) into (A.13), one gets
$$\begin{gathered} \dot{L}_{1} = \tilde{W}_{1}^{T} \left( {\frac{{\hat{\bar{\varsigma }}_{1} }}{{s_{1} }}\left( { - \hat{\varsigma }_{1}^{T} (t)\tilde{W}_{1} - \Pi_{1}^{T} (t)\tilde{W}_{3} - \Pi_{2}^{T} (t)\tilde{W}_{4} + \Xi_{1} (t)} \right)} \right) \hfill \\ + \tilde{W}_{1}^{T} \left( {\sum\limits_{k = 1}^{l} {\frac{{\hat{\bar{\varsigma }}_{1k} }}{{s_{1k} }}\left( { - \hat{\varsigma }_{1k}^{T} \tilde{W}_{1} - \Pi_{1}^{T} (t_{k} )\tilde{W}_{3} - \Pi_{2}^{T} (t_{k} )\tilde{W}_{4} + \Xi_{1} (t_{k} )} \right)} } \right) \hfill \\ \end{gathered}$$
(A.25)
where \(\hat{\bar{\varsigma }}_{1} = \hat{\varsigma }_{1} (t)/(\hat{\varsigma }_{1}^{T} (t)\hat{\varsigma }_{1} (t) + 1)\), \(\hat{\bar{\varsigma }}_{1k} \equiv \hat{\bar{\varsigma }}_{1} (t_{k} ) = \hat{\varsigma }_{1k} /(\hat{\varsigma }_{1k}^{T} \hat{\varsigma }_{1k} + 1)\), \(s_{1} = \hat{\varsigma }_{1}^{T} (t)\hat{\varsigma }_{1} (t) + 1\), \(s_{1k} \equiv s_{1} (t_{k} ) = \hat{\varsigma }_{1k}^{T} \hat{\varsigma }_{1k} + 1\).
Denoting \({\rm T}_{1k} = - \tilde{W}_{3}^{T} \Pi_{1} (t_{k} ) - \tilde{W}_{4}^{T} \Pi_{2} (t_{k} )\) and \(\Xi_{1k} = \Xi_{1} (t_{k} )\), (A.25) becomes
$$\begin{gathered} \dot{L}_{1} = - \tilde{W}_{1}^{T} \left[ {\hat{\bar{\varsigma }}_{1} \hat{\bar{\varsigma }}_{1}^{T} + \sum\limits_{k = 1}^{l} {\hat{\bar{\varsigma }}_{1k} \hat{\varsigma }_{1k}^{T} } } \right]\tilde{W}_{1} (t) + \tilde{W}_{1}^{T} \left( {\frac{{\hat{\bar{\varsigma }}_{1} }}{{s_{1} }}\Xi_{1} (t) + \sum\limits_{k = 1}^{l} {\frac{{\hat{\bar{\varsigma }}_{1k} }}{{s_{1k} }}} \left( {T_{1k} + \Xi_{1k} } \right)} \right) \hfill \\ - \tilde{W}_{3}^{T} \Pi_{1} (t)\frac{{\hat{\bar{\varsigma }}_{1}^{T} }}{{s_{1} }}\tilde{W}_{1} - \tilde{W}_{4}^{T} \Pi_{2} (t)\frac{{\hat{\bar{\varsigma }}_{1}^{T} }}{{s_{1} }}\tilde{W}_{1} \hfill \\ \end{gathered}$$
(A.26)
Note that \({\rm T}_{1k}\) depends on the actor NN errors of the recorded past times. Now, using \(\tilde{W}_{1} = W_{1} - \hat{W}_{1}\), the third term in (A.2) is obtained as
$$\begin{gathered} \dot{L}_{1} = - \tilde{W}_{1}^{T} \left[ {\hat{\bar{\varsigma }}_{1} \hat{\bar{\varsigma }}_{1}^{T} + \sum\limits_{k = 1}^{l} {\hat{\bar{\varsigma }}_{1k} \hat{\varsigma }_{1k}^{T} } } \right]\tilde{W}_{1} + \tilde{W}_{1}^{T} \left( {\frac{{\hat{\bar{\varsigma }}_{1} }}{{s_{1} }}\Xi_{1} (t) + \sum\limits_{k = 1}^{l} {\frac{{\hat{\bar{\varsigma }}_{1k} }}{{s_{1k} }}} \left( {T_{1k} + \Xi_{1k} } \right)} \right) \hfill \\ - \tilde{W}_{3}^{T} \Pi_{1} (t)\frac{{\hat{\bar{\varsigma }}_{1}^{T} }}{{s_{1} }}W_{1} + \tilde{W}_{3}^{T} \Pi_{1} (t)\frac{{\hat{\bar{\varsigma }}_{1}^{T} }}{{s_{1} }}\hat{W}_{1} - \tilde{W}_{4}^{T} \Pi_{2} (t)\frac{{\hat{\bar{\varsigma }}_{1}^{T} }}{{s_{1} }}W_{1} + \tilde{W}_{4}^{T} \Pi_{2} (t)\frac{{\hat{\bar{\varsigma }}_{1}^{T} }}{{s_{1} }}\hat{W}_{1} \hfill \\ \end{gathered}$$
(A.27)
Similarly, the fourth term in (A.2) can be written as
$$\begin{gathered} \dot{L}_{2} = - \tilde{W}_{2}^{T} \left[ {\hat{\bar{\varsigma }}_{2} \hat{\bar{\varsigma }}_{2}^{T} + \sum\limits_{k = 1}^{l} {\hat{\bar{\varsigma }}_{2k} \hat{\varsigma }_{2k}^{T} } } \right]\tilde{W}_{2} + \tilde{W}_{2}^{T} \left( {\frac{{\hat{\bar{\varsigma }}_{2} }}{{s_{2} }}\Xi_{2} (t) + \sum\limits_{k = 1}^{l} {\frac{{\hat{\bar{\varsigma }}_{2k} }}{{s_{2k} }}} \left( {T_{2k} + \Xi_{2k} } \right)} \right) \hfill \\ - \tilde{W}_{4}^{T} \Pi^{\prime}_{2} (t)\frac{{\hat{\bar{\varsigma }}_{2}^{T} }}{{s_{2} }}W_{2} + \tilde{W}_{4}^{T} \Pi^{\prime}_{2} (t)\frac{{\hat{\bar{\varsigma }}_{2}^{T} }}{{s_{2} }}\hat{W}_{2} - \tilde{W}_{3}^{T} \Pi^{\prime}_{1} (t)\frac{{\hat{\bar{\varsigma }}_{2}^{T} }}{{s_{2} }}W_{2} + \tilde{W}_{3}^{T} \Pi^{\prime}_{1} (t)\frac{{\hat{\bar{\varsigma }}_{2}^{T} }}{{s_{2} }}\hat{W}_{2} \hfill \\ \end{gathered}$$
(A.28)
where \(\Pi^{\prime}_{1}\) and \(\Pi^{\prime}_{2}\) are defined as (33) and (34), \(\hat{\bar{\varsigma }}_{2} = \hat{\varsigma }_{2} (t)/(\hat{\varsigma }_{2}^{T} (t)\hat{\varsigma }_{2} (t) + 1)\), \(\hat{\bar{\varsigma }}_{2k} \equiv \hat{\bar{\varsigma }}_{2} (t_{k} ) = \hat{\varsigma }_{2k} /(\hat{\varsigma }_{2k}^{T} \hat{\varsigma }_{2k} + 1)\), \(s_{2} = \hat{\varsigma }_{2}^{T} (t)\hat{\varsigma }_{2} (t) + 1\), \(s_{2k} = \hat{\varsigma }_{2k}^{T} \hat{\varsigma }_{2k} + 1\), \({\rm T}_{2k} = - \tilde{W}_{3}^{T} \Pi^{\prime}_{1} (t_{k} ) - \tilde{W}_{4}^{T} \Pi^{\prime}_{2} (t_{k} )\), and the bounded term \(\Xi_{2} (t)\) is
$$\begin{gathered} \Xi_{2} (t) = W_{2}^{T} \nabla \sigma_{2} g_{1} \bar{u}_{1} \left( {\tanh (D_{1} ) - \tanh (\hat{D}_{1} )} \right) + W_{2}^{T} \nabla \sigma_{2} g_{2} \bar{u}_{2} \left( {\tanh (\delta D_{2} ) - \tanh (\delta \hat{D}_{2} )} \right) \hfill \\ + W_{1}^{T} \nabla \sigma_{1} g_{1} \bar{u}_{1} R_{11}^{ - T} R_{21} \left( {\tanh (\hat{D}_{1} ) - \tanh (D_{1} )} \right) + W_{1}^{T} \nabla \sigma_{1} g_{1} \bar{u}_{1} R_{11}^{ - T} R_{21} \left( {\tanh (\delta D_{1} ) - \tanh (\delta \hat{D}_{1} )} \right) \hfill \\ + \varepsilon_{h2} + \bar{u}_{2}^{2} \bar{R}_{22} \left( {\bar{\varepsilon }_{{\hat{D}_{2} }} - \bar{\varepsilon }_{{D_{2} }} } \right) + \bar{u}_{1}^{2} \bar{R}_{21} \left( {\bar{\varepsilon }_{{\hat{D}_{1} }} - \bar{\varepsilon }_{{D_{1} }} } \right) \hfill \\ \end{gathered}$$
(A.29)
Next, using (A.11)–(A.12) and (A.27)–(A.28), the derivative of the Lyapunov function (A.2) becomes
$$\begin{gathered} \dot{L} < - x^{T} qx + k_{1} \left\| x \right\| + k_{2} \hfill \\ - \tilde{W}_{1}^{T} \left[ {\hat{\bar{\varsigma }}_{1} \hat{\bar{\varsigma }}_{1}^{T} + \sum\limits_{k = 1}^{l} {\hat{\bar{\varsigma }}_{1k} \hat{\varsigma }_{1k}^{T} } } \right]\tilde{W}_{1} + \tilde{W}_{1}^{T} \left( {\frac{{\hat{\bar{\varsigma }}_{1} }}{{s_{1} }}\Xi_{1} (t) + \sum\limits_{k = 1}^{l} {\frac{{\hat{\bar{\varsigma }}_{1k} }}{{s_{1k} }}} \left( {T_{1k} + \Xi_{1k} } \right)} \right) \hfill \\ - \tilde{W}_{2}^{T} \left[ {\hat{\bar{\varsigma }}_{2} \hat{\bar{\varsigma }}_{2}^{T} + \sum\limits_{k = 1}^{l} {\hat{\bar{\varsigma }}_{2k} \hat{\varsigma }_{2k}^{T} } } \right]\tilde{W}_{2} + \tilde{W}_{2}^{T} \left( {\frac{{\hat{\bar{\varsigma }}_{2} }}{{s_{2} }}\Xi_{2} (t) + \sum\limits_{k = 1}^{l} {\frac{{\hat{\bar{\varsigma }}_{2k} }}{{s_{2k} }}} \left( {T_{2k} + \Xi_{2k} } \right)} \right) \hfill \\ - \tilde{W}_{3}^{T} \left( {a_{3}^{ - 1} \dot{\hat{W}}_{3} - \Pi_{1} (t)\frac{{\hat{\bar{\varsigma }}_{1}^{T} }}{{s_{1} }}\hat{W}_{1} - \Pi^{\prime}_{1} (t)\frac{{\hat{\bar{\varsigma }}_{2}^{T} }}{{s_{2} }}\hat{W}_{2} } \right) - \tilde{W}_{3}^{T} \left( {\Pi_{1} (t)\frac{{\hat{\bar{\varsigma }}_{1}^{T} }}{{s_{1} }}W_{1} + \Pi^{\prime}_{1} (t)\frac{{\hat{\bar{\varsigma }}_{2}^{T} }}{{s_{2} }}W_{2} } \right) \hfill \\ - \tilde{W}_{4}^{T} \left( {a_{4}^{ - 1} \dot{\hat{W}}_{4} - \Pi_{2} (t)\frac{{\hat{\bar{\varsigma }}_{1}^{T} }}{{s_{1} }}\hat{W}_{1} - \Pi^{\prime}_{2} (t)\frac{{\hat{\bar{\varsigma }}_{2}^{T} }}{{s_{2} }}\hat{W}_{2} } \right) - \tilde{W}_{4}^{T} \left( {\Pi_{2} (t)\frac{{\hat{\bar{\varsigma }}_{1}^{T} }}{{s_{1} }}W_{1} + \Pi^{\prime}_{2} (t)\frac{{\hat{\bar{\varsigma }}_{2}^{T} }}{{s_{2} }}W_{2} } \right) \hfill \\ \end{gathered}$$
(A.30)
where \(k_{1} = k^{\prime}_{1} + k^{\prime\prime}_{1}\), \(k_{2} = k^{\prime}_{2} + k^{\prime\prime}_{2}\), \(q = q_{1} + q_{2}\). Now, we define the actor NN tuning laws for the first and second agents as
$$\dot{\hat{W}}_{3} = - a_{3} \left( {\left( {B_{3} \hat{W}_{3} - B_{1} \hat{\bar{\varsigma }}_{1}^{T} \hat{W}_{1} } \right) - \Pi_{1} (t)\frac{{\hat{\bar{\varsigma }}_{1}^{T} }}{{s_{1} }}\hat{W}_{1} - \Pi^{\prime}_{1} (t)\frac{{\hat{\bar{\varsigma }}_{2}^{T} }}{{s_{2} }}\hat{W}_{2} } \right)$$
(A.31)
$$\dot{\hat{W}}_{4} = - a_{4} \left( {\left( {B_{4} \hat{W}_{4} - B_{2} \hat{\bar{\varsigma }}_{2}^{T} \hat{W}_{2} } \right) - \Pi_{2} (t)\frac{{\hat{\bar{\varsigma }}_{1}^{T} }}{{s_{1} }}\hat{W}_{1} - \Pi^{\prime}_{2} (t)\frac{{\hat{\bar{\varsigma }}_{2}^{T} }}{{s_{2} }}\hat{W}_{2} } \right)$$
(A.32)
These add to \(\dot{L}\) the terms
$$\begin{gathered} \tilde{W}_{3}^{T} B_{3} W_{1} - \tilde{W}_{3}^{T} B_{3} \tilde{W}_{3} - \tilde{W}_{3}^{T} B_{1} \hat{\bar{\varsigma }}_{1}^{T} W_{1} + \tilde{W}_{3}^{T} B_{1} \hat{\bar{\varsigma }}_{1}^{T} \tilde{W}_{1} + \hfill \\ \tilde{W}_{4}^{T} B_{4} W_{2} - \tilde{W}_{4}^{T} B_{4} \tilde{W}_{4} - \tilde{W}_{4}^{T} B_{2} \hat{\bar{\varsigma }}_{2}^{T} W_{2} + \tilde{W}_{4}^{T} B_{2} \hat{\bar{\varsigma }}_{2}^{T} \tilde{W}_{2} \hfill \\ \end{gathered}$$
(A.33)
Using (A.33) and applying Young's inequality [40] to the terms \(\tilde{W}_{3}^{T} B_{1} \hat{\bar{\varsigma }}_{1}^{T} \tilde{W}_{1}\) and \(\tilde{W}_{4}^{T} B_{2} \hat{\bar{\varsigma }}_{2}^{T} \tilde{W}_{2}\), \(\dot{L}\) becomes
$$\begin{gathered} \dot{L} < - x^{T} qx + k_{1} \left\| x \right\| + k_{2} \hfill \\ - \tilde{W}_{1}^{T} \left[ {\hat{\bar{\varsigma }}_{1} \hat{\bar{\varsigma }}_{1}^{T} + \sum\limits_{k = 1}^{l} {\hat{\bar{\varsigma }}_{1k} \hat{\varsigma }_{1k}^{T} } } \right]\tilde{W}_{1} + \tilde{W}_{1}^{T} \left( {\frac{{\hat{\bar{\varsigma }}_{1} }}{{s_{1} }}\Xi_{1} (t) + \sum\limits_{k = 1}^{l} {\frac{{\hat{\bar{\varsigma }}_{1k} }}{{s_{1k} }}} \left( {T_{1k} + \Xi_{1k} } \right)} \right) \hfill \\ - \tilde{W}_{3}^{T} B_{3} \tilde{W}_{3} + \tilde{W}_{3}^{T} B_{3} W_{1} - \tilde{W}_{3}^{T} B_{1} \hat{\bar{\varsigma }}_{1}^{T} W_{1} + \frac{1}{2}\tilde{W}_{3}^{T} B_{1} B_{1}^{T} \tilde{W}_{3} + \frac{1}{2}\tilde{W}_{1}^{T} \hat{\bar{\varsigma }}_{1} \hat{\bar{\varsigma }}_{1}^{T} \tilde{W}_{1} \hfill \\ - \tilde{W}_{2}^{T} \left[ {\hat{\bar{\varsigma }}_{2} \hat{\bar{\varsigma }}_{2}^{T} + \sum\limits_{k = 1}^{l} {\hat{\bar{\varsigma }}_{2k} \hat{\varsigma }_{2k}^{T} } } \right]\tilde{W}_{2} + \tilde{W}_{2}^{T} \left( {\frac{{\hat{\bar{\varsigma }}_{2} }}{{s_{2} }}\Xi_{2} (t) + \sum\limits_{k = 1}^{l} {\frac{{\hat{\bar{\varsigma }}_{2k} }}{{s_{2k} }}} \left( {T_{2k} + \Xi_{2k} } \right)} \right) \hfill \\ - \tilde{W}_{4}^{T} B_{4} \tilde{W}_{4} + \tilde{W}_{4}^{T} B_{4} W_{2} - \tilde{W}_{4}^{T} B_{2} \hat{\bar{\varsigma }}_{2}^{T} W_{2} + \frac{1}{2}\tilde{W}_{4}^{T} B_{2} B_{2}^{T} \tilde{W}_{4} + \frac{1}{2}\tilde{W}_{2}^{T} \hat{\bar{\varsigma }}_{2} \hat{\bar{\varsigma }}_{2}^{T} \tilde{W}_{2} \hfill \\ - \tilde{W}_{3}^{T} \left( {\Pi_{1} (t)\frac{{\hat{\bar{\varsigma }}_{1}^{T} }}{{s_{1} }}W_{1} + \Pi^{\prime}_{1} (t)\frac{{\hat{\bar{\varsigma }}_{2}^{T} }}{{s_{2} }}W_{2} } \right) - \tilde{W}_{4}^{T} \left( {\Pi_{2} (t)\frac{{\hat{\bar{\varsigma }}_{1}^{T} }}{{s_{1} }}W_{1} + \Pi^{\prime}_{2} (t)\frac{{\hat{\bar{\varsigma }}_{2}^{T} }}{{s_{2} }}W_{2} } \right) \hfill \\ \end{gathered}$$
(A.34)
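The Young's inequality step used to reach (A.34) can be written out for the first cross term. This sketch assumes the dimensions implied by the tuning law (A.31), namely that \(B_{1}\) is a column vector of the same length as \(\tilde{W}_{3}\), so that \(\alpha = B_{1}^{T} \tilde{W}_{3}\) and \(\beta = \hat{\bar{\varsigma }}_{1}^{T} \tilde{W}_{1}\) are scalars (the symbols \(\alpha\) and \(\beta\) are introduced here only for illustration):
$$\tilde{W}_{3}^{T} B_{1} \hat{\bar{\varsigma }}_{1}^{T} \tilde{W}_{1} = \alpha \beta \le \frac{1}{2}\alpha^{2} + \frac{1}{2}\beta^{2} = \frac{1}{2}\tilde{W}_{3}^{T} B_{1} B_{1}^{T} \tilde{W}_{3} + \frac{1}{2}\tilde{W}_{1}^{T} \hat{\bar{\varsigma }}_{1} \hat{\bar{\varsigma }}_{1}^{T} \tilde{W}_{1}$$
The bound on \(\tilde{W}_{4}^{T} B_{2} \hat{\bar{\varsigma }}_{2}^{T} \tilde{W}_{2}\) follows in the same way with the indices \(3 \to 4\) and \(1 \to 2\).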
Denote \(N_{i} = \hat{\bar{\varsigma }}_{i} \hat{\bar{\varsigma }}_{i}^{T} + 2\sum\nolimits_{k = 1}^{l} {\hat{\bar{\varsigma }}_{ik} \hat{\varsigma }_{ik}^{T} }\) and \(\varGamma_{i} = \frac{{\hat{\bar{\varsigma }}_{i} }}{{s_{i} }}\Xi_{i} (t) + \sum\nolimits_{k = 1}^{l} {\frac{{\hat{\bar{\varsigma }}_{ik} }}{{s_{ik} }}} \left( {T_{ik} + \Xi_{ik} } \right)\), \(i = 1,\,2\). If Condition 1 is satisfied, then \(N_{i}\) is positive definite and thus \(\dot{L}\) can be bounded as
$$\begin{gathered} \dot{L} < - x^{T} qx + k_{1} \left\| x \right\| + k_{2} \hfill \\ - 0.5\lambda_{\hbox{min} } (N_{1} )\tilde{W}_{1}^{T} \tilde{W}_{1} + \tilde{W}_{1}^{T} \varGamma_{1} - 0.5\lambda_{\hbox{min} } (N_{2} )\tilde{W}_{2}^{T} \tilde{W}_{2} + \tilde{W}_{2}^{T} \varGamma_{2} \hfill \\ - \tilde{W}_{3}^{T} \left( {B_{3} - \frac{1}{2}B_{1} B_{1}^{T} } \right)\tilde{W}_{3} + \tilde{W}_{3}^{T} \left( {B_{3} W_{1} + B_{1} \hat{\bar{\varsigma }}_{1}^{T} W_{1} + \Pi_{1} \frac{{\hat{\bar{\varsigma }}_{1}^{T} }}{{s_{1} }}W_{1} + \Pi^{\prime}_{1} \frac{{\hat{\bar{\varsigma }}_{2}^{T} }}{{s_{2} }}W_{2} } \right) \hfill \\ - \tilde{W}_{4}^{T} \left( {B_{4} - \frac{1}{2}B_{2} B_{2}^{T} } \right)\tilde{W}_{4} + \tilde{W}_{4}^{T} \left( {B_{4} W_{2} + B_{2} \hat{\bar{\varsigma }}_{2}^{T} W_{2} + \Pi_{2} \frac{{\hat{\bar{\varsigma }}_{1}^{T} }}{{s_{1} }}W_{1} + \Pi^{\prime}_{2} \frac{{\hat{\bar{\varsigma }}_{2}^{T} }}{{s_{2} }}W_{2} } \right) \hfill \\ \end{gathered}$$
(A.35)
where \(\lambda_{\hbox{min} } (N_{i} )\) is the minimum eigenvalue of \(N_{i}\), \(i = 1,\,2\). Define \(c = B_{3} - \frac{1}{2}B_{1} B_{1}^{T}\) and \(d = B_{4} - \frac{1}{2}B_{2} B_{2}^{T}\).
If we choose the design parameters \(B_{1} ,\,B_{2} ,\,B_{3} ,\,B_{4}\) such that \(c > 0\) and \(d > 0\), then the derivative of the Lyapunov function is less than zero if
$$\left\| x \right\| > \frac{{k_{1} }}{{2\lambda_{\hbox{min} } (q)}} + \sqrt {\frac{{k_{1}^{2} }}{{4\lambda_{\hbox{min} }^{2} (q)}} + \frac{{k_{2} }}{{\lambda_{\hbox{min} } (q)}}}$$
(A.36)
$$\left\| {\tilde{W}_{1} } \right\| > \frac{{2\varGamma_{1} }}{{\lambda_{\hbox{min} } (N_{1} )}}$$
(A.37)
$$\left\| {\tilde{W}_{2} } \right\| > \frac{{2\varGamma_{2} }}{{\lambda_{\hbox{min} } (N_{2} )}}$$
(A.38)
$$\left\| {\tilde{W}_{3} } \right\| > \frac{{B_{3} W_{1} + B_{1} \hat{\bar{\varsigma }}_{1}^{T} W_{1} + \Pi_{1} \frac{{\hat{\bar{\varsigma }}_{1}^{T} }}{{s_{1} }}W_{1} + \Pi^{\prime}_{1} \frac{{\hat{\bar{\varsigma }}_{2}^{T} }}{{s_{2} }}W_{2} }}{c}$$
(A.39)
$$\left\| {\tilde{W}_{4} } \right\| > \frac{{B_{4} W_{2} + B_{2} \hat{\bar{\varsigma }}_{2}^{T} W_{2} + \Pi_{2} \frac{{\hat{\bar{\varsigma }}_{1}^{T} }}{{s_{1} }}W_{1} + \Pi^{\prime}_{2} \frac{{\hat{\bar{\varsigma }}_{2}^{T} }}{{s_{2} }}W_{2} }}{d}$$
(A.40)
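The thresholds (A.36)–(A.38) follow from elementary sign conditions on the scalar bounds in (A.35). As a sketch, write \(r = \left\| x \right\|\) and use \(x^{T} qx \ge \lambda_{\hbox{min} } (q)r^{2}\); for \(k_{1} ,k_{2} \ge 0\) the state-dependent part is a downward parabola in \(r\) whose only nonnegative root is the right-hand side of (A.36):
$$- \lambda_{\hbox{min} } (q)r^{2} + k_{1} r + k_{2} < 0\quad \Leftrightarrow \quad r > \frac{{k_{1} + \sqrt {k_{1}^{2} + 4\lambda_{\hbox{min} } (q)k_{2} } }}{{2\lambda_{\hbox{min} } (q)}} = \frac{{k_{1} }}{{2\lambda_{\hbox{min} } (q)}} + \sqrt {\frac{{k_{1}^{2} }}{{4\lambda_{\hbox{min} }^{2} (q)}} + \frac{{k_{2} }}{{\lambda_{\hbox{min} } (q)}}}$$
For the critic errors, combining the bracketed quadratic term of (A.34) with the Young's inequality remainder gives the factor 0.5 in (A.35), and the Cauchy–Schwarz inequality then gives the threshold, with \(\varGamma_{i}\) in (A.37)–(A.38) read as its norm:
$$\begin{gathered} - \tilde{W}_{i}^{T} \left[ {\hat{\bar{\varsigma }}_{i} \hat{\bar{\varsigma }}_{i}^{T} + \sum\limits_{k = 1}^{l} {\hat{\bar{\varsigma }}_{ik} \hat{\varsigma }_{ik}^{T} } } \right]\tilde{W}_{i} + \frac{1}{2}\tilde{W}_{i}^{T} \hat{\bar{\varsigma }}_{i} \hat{\bar{\varsigma }}_{i}^{T} \tilde{W}_{i} = - \frac{1}{2}\tilde{W}_{i}^{T} N_{i} \tilde{W}_{i} \le - \frac{1}{2}\lambda_{\hbox{min} } (N_{i} )\left\| {\tilde{W}_{i} } \right\|^{2} \hfill \\ - \frac{1}{2}\lambda_{\hbox{min} } (N_{i} )\left\| {\tilde{W}_{i} } \right\|^{2} + \tilde{W}_{i}^{T} \varGamma_{i} \le - \left\| {\tilde{W}_{i} } \right\|\left( {\frac{1}{2}\lambda_{\hbox{min} } (N_{i} )\left\| {\tilde{W}_{i} } \right\| - \left\| {\varGamma_{i} } \right\|} \right) < 0\quad {\text{if}}\quad \left\| {\tilde{W}_{i} } \right\| > \frac{{2\left\| {\varGamma_{i} } \right\|}}{{\lambda_{\hbox{min} } (N_{i} )}} \hfill \\ \end{gathered}$$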
Thus, using standard Lyapunov theory, all the critic and actor NN weight estimation errors are UUB, and the system states are guaranteed to never leave their initial compact set.
This completes the proof. \(\square\)
Appendix B
Proof of Theorem 2
a. Consider all the UUB weight errors in Theorem 2. The approximate constrained coupled HJ equations are
$$\begin{gathered} H_{1} \left( {x,\hat{W}_{1} ,\hat{u}_{1} ,\hat{u}_{2} } \right) = Q_{1} (x) + \hat{W}_{1}^{T} \nabla \sigma_{1} f - \hat{W}_{1}^{T} \nabla \sigma_{1} g_{1} \bar{u}_{1} \tanh (\hat{D}_{1} ) - \hat{W}_{1}^{T} \nabla \sigma_{1} g_{2} \bar{u}_{2} \tanh (\hat{D}_{2} ) \hfill \\ + \hat{W}_{3}^{T} \nabla \sigma_{1} g_{1} \bar{u}_{1} \tanh (\hat{D}_{1} ) - \hat{W}_{3}^{T} \nabla \sigma_{1} g_{1} \bar{u}_{1} \tanh (\delta \hat{D}_{1} ) + \bar{u}_{1}^{2} \bar{R}_{11} \bar{\varepsilon }_{{\hat{D}_{1} }} \hfill \\ + \hat{W}_{4}^{T} \nabla \sigma_{2} g_{2} \bar{u}_{2} R_{22}^{ - T} R_{12} \tanh (\hat{D}_{2} ) - \hat{W}_{4}^{T} \nabla \sigma_{2} g_{2} \bar{u}_{2} R_{22}^{ - T} R_{12} \tanh (\delta \hat{D}_{2} ) + \bar{u}_{2}^{2} \bar{R}_{12} \bar{\varepsilon }_{{\hat{D}_{2} }} \hfill \\ \end{gathered}$$
(B.1)
$$\begin{gathered} H_{2} \left( {x,\hat{W}_{2} ,\hat{u}_{1} ,\hat{u}_{2} } \right) = Q_{2} (x) + \hat{W}_{2}^{T} \nabla \sigma_{2} f - \hat{W}_{2}^{T} \nabla \sigma_{2} g_{1} \bar{u}_{1} \tanh (\hat{D}_{1} ) - \hat{W}_{2}^{T} \nabla \sigma_{2} g_{2} \bar{u}_{2} \tanh (\hat{D}_{2} ) \hfill \\ + \hat{W}_{3}^{T} \nabla \sigma_{1} g_{1} \bar{u}_{1} R_{11}^{ - T} R_{21} \tanh (\hat{D}_{1} ) - \hat{W}_{3}^{T} \nabla \sigma_{1} g_{1} \bar{u}_{1} R_{11}^{ - T} R_{21} \tanh (\delta \hat{D}_{1} ) + \bar{u}_{1}^{2} \bar{R}_{21} \bar{\varepsilon }_{{\hat{D}_{1} }} \hfill \\ + \hat{W}_{4}^{T} \nabla \sigma_{2} g_{2} \bar{u}_{2} \tanh (\hat{D}_{2} ) - \hat{W}_{4}^{T} \nabla \sigma_{2} g_{2} \bar{u}_{2} \tanh (\delta \hat{D}_{2} ) + \bar{u}_{2}^{2} \bar{R}_{22} \bar{\varepsilon }_{{\hat{D}_{2} }} \hfill \\ \end{gathered}$$
(B.2)
After adding zero from HJ equations in (17) and (18) and using the fact that \(\tilde{W}_{1} = W_{1} - \hat{W}_{1}\), \(\tilde{W}_{2} = W_{2} - \hat{W}_{2}\), \(\tilde{W}_{3} = W_{1} - \hat{W}_{3}\), \(\tilde{W}_{4} = W_{2} - \hat{W}_{4}\), one has
$$\begin{gathered} H_{1} \left( {x,\hat{W}_{1} ,\hat{u}_{1} ,\hat{u}_{2} } \right) = - \tilde{W}_{1}^{T} \nabla \sigma_{1} f + \tilde{W}_{1}^{T} \nabla \sigma_{1} g_{1} \bar{u}_{1} \tanh (\hat{D}_{1} ) + \tilde{W}_{1}^{T} \nabla \sigma_{1} g_{2} \bar{u}_{2} \tanh (\hat{D}_{2} ) \hfill \\ + \tilde{W}_{3}^{T} \nabla \sigma_{1} g_{1} \bar{u}_{1} \left( {\tanh (\delta \hat{D}_{1} ) - \tanh (\hat{D}_{1} )} \right) \hfill \\ + \tilde{W}_{4}^{T} \nabla \sigma_{2} g_{2} \bar{u}_{2} R_{22}^{ - T} R_{12} \left( {\tanh (\delta \hat{D}_{2} ) - \tanh (\hat{D}_{2} )} \right) \hfill \\ + W_{1}^{T} \nabla \sigma_{1} g_{2} \bar{u}_{2} \left( {\tanh (D_{2} ) - \tanh (\hat{D}_{2} )} \right) \hfill \\ + W_{1}^{T} \nabla \sigma_{1} g_{1} \bar{u}_{1} \left( {\tanh (\delta D_{1} ) - \tanh (\delta \hat{D}_{1} )} \right)\, \hfill \\ + W_{2}^{T} \nabla \sigma_{2} g_{2} \bar{u}_{2} R_{22}^{ - T} R_{12} \left( {\tanh (\delta D_{2} ) - \tanh (\delta \hat{D}_{2} )} \right) \hfill \\ + W_{2}^{T} \nabla \sigma_{2} g_{2} \bar{u}_{2} R_{22}^{ - T} R_{12} \left( {\tanh (\hat{D}_{2} ) - \tanh (D_{2} )} \right) \hfill \\ + \bar{u}_{1}^{2} \bar{R}_{11} \bar{\varepsilon }_{{\hat{D}_{1} }} + \bar{u}_{2}^{2} \bar{R}_{12} \bar{\varepsilon }_{{\hat{D}_{2} }} - \bar{u}_{1}^{2} \bar{R}_{11} \bar{\varepsilon }_{{D_{1} }} - \bar{u}_{2}^{2} \bar{R}_{12} \bar{\varepsilon }_{{D_{2} }} + \varepsilon_{CHJ1} \hfill \\ \end{gathered}$$
(B.3)
$$\begin{gathered} H_{2} \left( {x,\hat{W}_{2} ,\hat{u}_{1} ,\hat{u}_{2} } \right) = - \tilde{W}_{2}^{T} \nabla \sigma_{2} f + \tilde{W}_{2}^{T} \nabla \sigma_{2} g_{1} \bar{u}_{1} \tanh (\hat{D}_{1} ) + \tilde{W}_{2}^{T} \nabla \sigma_{2} g_{2} \bar{u}_{2} \tanh (\hat{D}_{2} ) \hfill \\ + \tilde{W}_{4}^{T} \nabla \sigma_{2} g_{2} \bar{u}_{2} \left( {\tanh (\delta \hat{D}_{2} ) - \tanh (\hat{D}_{2} )} \right) \hfill \\ + \tilde{W}_{3}^{T} \nabla \sigma_{1} g_{1} \bar{u}_{1} R_{11}^{ - T} R_{21} \left( {\tanh (\delta \hat{D}_{1} ) - \tanh (\hat{D}_{1} )} \right) \hfill \\ + W_{1}^{T} \nabla \sigma_{1} g_{1} \bar{u}_{1} R_{11}^{ - T} R_{21} \left( {\tanh (\hat{D}_{1} ) - \tanh (D_{1} )} \right) \hfill \\ + W_{1}^{T} \nabla \sigma_{1} g_{1} \bar{u}_{1} R_{11}^{ - T} R_{21} \left( {\tanh (\delta D_{1} ) - \tanh (\delta \hat{D}_{1} )} \right) \hfill \\ + W_{2}^{T} \nabla \sigma_{2} g_{1} \bar{u}_{1} \left( {\tanh (D_{1} ) - \tanh (\hat{D}_{1} )} \right) \hfill \\ + W_{2}^{T} \nabla \sigma_{2} g_{2} \bar{u}_{2} \left( {\tanh (\delta D_{2} ) - \tanh (\delta \hat{D}_{2} )} \right) \hfill \\ + \bar{u}_{1}^{2} \bar{R}_{21} \bar{\varepsilon }_{{\hat{D}_{1} }} + \bar{u}_{2}^{2} \bar{R}_{22} \bar{\varepsilon }_{{\hat{D}_{2} }} - \bar{u}_{1}^{2} \bar{R}_{21} \bar{\varepsilon }_{{D_{1} }} - \bar{u}_{2}^{2} \bar{R}_{22} \bar{\varepsilon }_{{D_{2} }} + \varepsilon_{CHJ2} \hfill \\ \end{gathered}$$
(B.4)
Now, using Assumptions 1 and 2 and taking norms in (B.3) and (B.4) gives
$$\begin{gathered} \left\| {H_{1} \left( {x,\hat{W}_{1} ,\hat{u}_{1} ,\hat{u}_{2} } \right)} \right\| \le b_{f} b_{\sigma 1x} \left\| x \right\|\left\| {\tilde{W}_{1}^{T} } \right\| + b_{\sigma 1x} \left( {b_{g1} \bar{u}_{1} + b_{g2} \bar{u}_{2} } \right)\left\| {\tilde{W}_{1}^{T} } \right\| + b_{\sigma 1x} b_{g1} \bar{u}_{1} \left\| {\tilde{W}_{3}^{T} } \right\| \hfill \\ + b_{\sigma 2x} b_{g2} \bar{u}_{2} \lambda_{\hbox{max} } \left( {R_{22}^{ - T} } \right)\lambda_{\hbox{max} } \left( {R_{12} } \right)\left\| {\tilde{W}_{4}^{T} } \right\| \hfill \\ + b_{\sigma 1x} b_{g2} \bar{u}_{2} W_{1m}^{T} + b_{\sigma 1x} b_{g1} \bar{u}_{1} W_{1m}^{T} + 2b_{\sigma 2x} b_{g2} \bar{u}_{2} \lambda_{\hbox{max} } \left( {R_{22}^{ - T} } \right)\lambda_{\hbox{max} } \left( {R_{12} } \right)W_{2m}^{T} + \left\| {\varepsilon_{1} } \right\| \hfill \\ \end{gathered}$$
(B.5)
$$\begin{gathered} \left\| {H_{2} \left( {x,\hat{W}_{2} ,\hat{u}_{1} ,\hat{u}_{2} } \right)} \right\| \le b_{f} b_{\sigma 2x} \left\| x \right\|\left\| {\tilde{W}_{2}^{T} } \right\| + b_{\sigma 2x} b_{g1} \bar{u}_{1} \left\| {\tilde{W}_{2}^{T} } \right\| + b_{\sigma 2x} b_{g2} \bar{u}_{2} \left\| {\tilde{W}_{2}^{T} } \right\| \hfill \\ + b_{\sigma 2x} b_{g2} \bar{u}_{2} \left\| {\tilde{W}_{4}^{T} } \right\| + 2b_{\sigma 1x} b_{g1} \bar{u}_{1} \lambda_{\hbox{max} } \left( {R_{11}^{ - T} } \right)\lambda_{\hbox{max} } \left( {R_{21} } \right)\left\| {\tilde{W}_{3}^{T} } \right\| \hfill \\ + 2b_{\sigma 1x} b_{g1} \bar{u}_{1} \lambda_{\hbox{max} } \left( {R_{11}^{ - T} } \right)\lambda_{\hbox{max} } \left( {R_{21} } \right)W_{1m}^{T} + b_{\sigma 2x} b_{g1} \bar{u}_{1} W_{2m}^{T} + b_{\sigma 2x} b_{g2} \bar{u}_{2} W_{2m}^{T} + \left\| {\varepsilon_{2} } \right\| \hfill \\ \end{gathered}$$
(B.6)
where
$$\begin{gathered} \left\| {\varepsilon_{1} } \right\| \le \bar{u}_{1}^{2} \bar{R}_{11} \left( {\bar{\varepsilon }_{{\hat{D}_{1} }} - \bar{\varepsilon }_{{D_{1} }} } \right) + \bar{u}_{2}^{2} \bar{R}_{12} \left( {\bar{\varepsilon }_{{\hat{D}_{2} }} - \bar{\varepsilon }_{{D_{2} }} } \right) + \varepsilon_{h1} \hfill \\ \left\| {\varepsilon_{2} } \right\| \le \bar{u}_{1}^{2} \bar{R}_{21} \left( {\bar{\varepsilon }_{{\hat{D}_{1} }} - \bar{\varepsilon }_{{D_{1} }} } \right) + \bar{u}_{2}^{2} \bar{R}_{22} \left( {\bar{\varepsilon }_{{\hat{D}_{2} }} - \bar{\varepsilon }_{{D_{2} }} } \right) + \varepsilon_{h2}, \hfill \\ \end{gathered}$$
and \(\varepsilon_{h1}\), \(\varepsilon_{h2}\) are bounds for \(\varepsilon_{CHJ1}\), \(\varepsilon_{CHJ2}\), respectively. All the signals on the right-hand side of (B.5) and (B.6) are UUB. Therefore, \(\left\| {H_{1} \left( {x,\hat{W}_{1} ,\hat{u}_{1} ,\hat{u}_{2} } \right)} \right\|\) and \(\left\| {H_{2} \left( {x,\hat{W}_{2} ,\hat{u}_{1} ,\hat{u}_{2} } \right)} \right\|\) are UUB and convergence to the approximate coupled HJ solutions is obtained.
b. Consider \(\hat{u}_{1}\) and \(\hat{u}_{2}\) in (25) and (26). Then one has
$$\begin{gathered} \left\| {u_{1} - \hat{u}_{1} } \right\| = \bar{u}_{1} \left\| { - \tanh \left( {1/(2\bar{u}_{1} )R_{11}^{ - 1} g_{1}^{T} \nabla \sigma_{1}^{T} W_{1} } \right) + \tanh \left( {1/(2\bar{u}_{1} )R_{11}^{ - 1} g_{1}^{T} \nabla \sigma_{1}^{T} \hat{W}_{3} } \right)} \right\| \hfill \\ \le \bar{u}_{1} \left\| { - \tanh \left( {1/(2\bar{u}_{1} )R_{11}^{ - 1} g_{1}^{T} \nabla \sigma_{1}^{T} W_{1} } \right) + \tanh \left( {1/(2\bar{u}_{1} )R_{11}^{ - 1} g_{1}^{T} \nabla \sigma_{1}^{T} \left( {W_{1} - \tilde{W}_{3} } \right)} \right)} \right\| \hfill \\ \end{gathered}$$
(B.7)
$$\begin{gathered} \left\| {u_{2} - \hat{u}_{2} } \right\| = \bar{u}_{2} \left\| { - \tanh \left( {1/(2\bar{u}_{2} )R_{22}^{ - 1} g_{2}^{T} \nabla \sigma_{2}^{T} W_{2} } \right) + \tanh \left( {1/(2\bar{u}_{2} )R_{22}^{ - 1} g_{2}^{T} \nabla \sigma_{2}^{T} \hat{W}_{4} } \right)} \right\| \hfill \\ \le \bar{u}_{2} \left\| { - \tanh \left( {1/(2\bar{u}_{2} )R_{22}^{ - 1} g_{2}^{T} \nabla \sigma_{2}^{T} W_{2} } \right) + \tanh \left( {1/(2\bar{u}_{2} )R_{22}^{ - 1} g_{2}^{T} \nabla \sigma_{2}^{T} \left( {W_{2} - \tilde{W}_{4} } \right)} \right)} \right\| \hfill \\ \end{gathered}$$
(B.8)
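The passage from (B.7) and (B.8) to the UUB conclusion can be made explicit. As a sketch, it uses the fact that \(\tanh\) is 1-Lipschitz elementwise, so \(\left\| {\tanh (a) - \tanh (b)} \right\| \le \left\| {a - b} \right\|\) for vectors \(a\), \(b\) in the Euclidean norm, together with the bounds \(\left\| {g_{i} } \right\| \le b_{gi}\) and \(\left\| {\nabla \sigma_{i} } \right\| \le b_{\sigma ix}\), which are assumed here to come from Assumptions 1 and 2:
$$\begin{gathered} \left\| {u_{1} - \hat{u}_{1} } \right\| \le \bar{u}_{1} \left\| {\frac{1}{{2\bar{u}_{1} }}R_{11}^{ - 1} g_{1}^{T} \nabla \sigma_{1}^{T} \tilde{W}_{3} } \right\| \le \frac{1}{2}\left\| {R_{11}^{ - 1} } \right\|b_{g1} b_{\sigma 1x} \left\| {\tilde{W}_{3} } \right\| \hfill \\ \left\| {u_{2} - \hat{u}_{2} } \right\| \le \bar{u}_{2} \left\| {\frac{1}{{2\bar{u}_{2} }}R_{22}^{ - 1} g_{2}^{T} \nabla \sigma_{2}^{T} \tilde{W}_{4} } \right\| \le \frac{1}{2}\left\| {R_{22}^{ - 1} } \right\|b_{g2} b_{\sigma 2x} \left\| {\tilde{W}_{4} } \right\| \hfill \\ \end{gathered}$$
Since \(\tilde{W}_{3}\) and \(\tilde{W}_{4}\) are UUB, the control errors inherit a bound proportional to the actor weight error bounds.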
Hence, \(\left\| {u_{1} - \hat{u}_{1} } \right\|\) and \(\left\| {u_{2} - \hat{u}_{2} } \right\|\) are UUB. Therefore, the pair \(\left( {\hat{u}_{1} ,\hat{u}_{2} } \right)\) gives the approximate Nash equilibrium solution of the game and this completes the proof.\(\square\)