Appendix
Two lemmas are first presented in order to prove the main theorem.
Lemma 1
Suppose that
\({F:\mathbb{R}^Q\rightarrow \mathbb{R}}\)
is continuous and differentiable on a compact set
\({{\bf D}\subset\mathbb{R}^Q,}\)
and that
\(\overline{\varvec{\Upomega}}=\{{\bf x}\in {\bf D}\,|\,\frac{\partial F({\bf x})}{\partial{\bf x}}=0\}\)
contains only a finite number of points. If a sequence
\(\{{\bf x}^k\}\subset {\bf D}\)
satisfies
$$ \lim\limits_{k\rightarrow\infty}\|{\bf x}^{k+1}-{\bf x}^k\|=0,\quad \lim\limits_{k\rightarrow\infty}\left\|\frac{\partial F({\bf x}^k)} {\partial{\bf x}}\right\|=0, $$
then there exists a point
\({\bf x}^{\ast}\in \overline{\varvec{\Upomega}}\)
such that
\(\lim_{k\rightarrow\infty}{\bf x}^k={\bf x}^{\ast}.\)
Proof
This result is almost the same as Theorem 14.1.5 in [29]; the details of the proof are therefore omitted. \(\square\)
To continue the proof of the theorem, introduce the following notations for any 1 ≤ j ≤ J, 1 ≤ i ≤ n and \(k=0,1,2,\ldots\):
$$ \begin{aligned} \mathit{\Upphi}_0^{k,j}={\bf u}^k\cdot (f^{k,j}\odot G^{k,j}),\quad& \varphi^{k,j}=f^{k+1,j}-f^{k,j},\\ \quad&\psi^{k,j}=G^{k+1,j}-G^{k,j},\\ \end{aligned} $$
(21)
$$ \xi_i^{k,j}={\bf x}^j-{\bf a}_{\bf i}^k, \quad \mathit{\Upphi}_i^{k,j}=\xi_i^{k,j}\odot {\bf b}_{\bf i}^k. $$
(22)
Lemma 2
Suppose that Assumptions (A1) and (A2) both hold. Then, for all 1 ≤ i ≤ n, 1 ≤ j ≤ J and \(k=0,1,2,\ldots\), the following estimates hold:
$$ \|\mathit{\Upphi}_0^{k,j}\|\leq nC_0, \quad \|\xi_i^{k,j}\|\leq C_1, \quad \|\mathit{\Upphi}_i^{k,j}\|\leq C_1, \quad \|O^{j}\|\leq C_1, $$
(23)
$$ \sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j})(\Updelta {\bf u}^{k}\cdot (\varphi^{k,j}\odot \psi^{k,j}))\leq C_2\eta^2 \left\|\frac{\partial E({\bf W}^k)}{\partial{\bf W}}\right\|^2, $$
(24)
$$ \begin{aligned} &\sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j})(\Updelta{\bf u}^k\cdot(\varphi^{k,j}\odot G^{k,j}))\\ &\quad \leq C_3\eta^2\left( \left\|\frac{\partial E({\bf W}^k)} {\partial{\bf u}}\right\|^2+\sum\limits_{i=1}^n \left\|\frac{\partial E({\bf W}^k)}{\partial{\bf v}_{\bf i}}\right\|^2\right), \end{aligned} $$
(25)
$$ \begin{aligned} &\sum\limits_{j=1}^J g_j^{\prime} (\mathit{\Upphi}_0^{k,j})(\Updelta {\bf u}^k\cdot(f^{k,j}\odot\psi^{k,j}))\\ &\quad \leq C_4\eta^2 \left( \left\|\frac{\partial E({\bf W}^k)} {\partial{\bf u}}\right\|^2+ \sum\limits_{i=1}^n \left\|\frac{\partial E({\bf W}^k)}{\partial{\bf a}_{\bf i}}\right\|^2+ \sum\limits_{i=1}^n \left\| \frac{\partial E({\bf W}^k)}{\partial{\bf b}_{\bf i}}\right\|^2\right), \end{aligned} $$
(26)
$$ \begin{aligned} &\sum\limits_{j=1}^J g_j^{\prime} (\mathit{\Upphi}_0^{k,j})({\bf u}^k\cdot(\varphi^{k,j}\odot\psi^{k,j}))\\ &\quad\leq C_5\eta^2\sum\limits_{i=1}^n\left( \left\| \frac{\partial E({\bf W}^k)}{\partial{\bf v}_{\bf i}}\right\|^2+ \left\|\frac{\partial E({\bf W}^k)}{\partial{\bf a}_{\bf i}} \right\|^2+\left\|\frac{\partial E({\bf W}^k)}{\partial{\bf b}_{\bf i}}\right\|^2\right),\\ \end{aligned} $$
(27)
$$ \sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j})(\Updelta{\bf u}^k\cdot(f^{k,j}\odot G^{k,j}))= -\eta \left\|\frac{\partial E({\bf W}^k)}{\partial{\bf u}}\right\|^2, $$
(28)
$$ \sum\limits_{{j = 1}}^{J} {g_{j}^{\prime } } \left({\Upphi}_{0}^{{k,j}} \right)({\mathbf{u}}^{k} \cdot(\varphi^{{k,j}} \odot G^{{k,j}} )) \le - (\eta - C_{6} \eta ^{2})\sum\limits_{{i = 1}}^{n} {\left\| {\frac{{\partial E({\mathbf{W}}^{k} )}}{{\partial {\mathbf{v}}_{{\mathbf{i}}} }}}\right\|^{2} } , $$
(29)
$$ \begin{aligned} &\sum\limits_{j=1}^J g_j^{\prime} (\mathit{\Upphi}_0^{k,j})({\bf u}^k\cdot(f^{k,j}\odot \psi^{k,j}))\\ &\quad\leq -(\eta-C_7\eta^2)\sum\limits_{i=1}^n\left(\| \frac{\partial E({\bf W}^k)}{\partial{\bf a}_{\bf i}}\|^2+ \left\|\frac{\partial E({\bf W}^k)}{\partial{\bf b}_{\bf i}}\right\|^2\right),\\ \end{aligned} $$
(30)
$$ \frac{1}{2}\sum\limits_{j=1}^J g_j^{\prime\prime}(s_{k,j}) (\mathit{\Upphi}_0^{k+1,j}-\mathit{\Upphi}_0^{k,j})^2\leq C_8\eta^2\left\|\frac{\partial E({\bf W}^k)}{\partial{\bf W}}\right\|^2, $$
(31)
where \(C_m\) (\(m=1,2,\ldots,8\)) are constants independent of \(k\) and \(j\), and each \({s_{k,j}\in\mathbb{R}}\) is a constant lying on the segment between \(\mathit{\Upphi}_0^{k,j}\) and \({\mathit{\Upphi}}_0^{k+1,j}.\)
To keep the presentation readable, the convergence theorem is proved first by using the two lemmas above. The rather tedious proof of Lemma 2 is then given at the end of this Appendix.
Proof of Theorem 1
The proof is divided into three parts, dealing with each of statements (i), (ii) and (iii), respectively.
Proof of Statement (i)
Using the Taylor expansion and Lemma 2, the following can be established for all \(k=0,1,2,\ldots\):
$$ \begin{aligned} &E({\bf W}^{k+1})-E({\bf W}^{k})\\ &\quad=\sum\limits_{j=1}^J (g_j(\mathit{\Upphi}_0^{k+1,j})-g_j(\mathit{\Upphi}_0^{k,j}))\\ &\quad=\sum\limits_{j=1}^J \left[\vphantom{\frac{1}{2}} g_j^{\prime}(\mathit{\Upphi}_0^{k,j})(\mathit{\Upphi}_0^{k+1,j}-\mathit{\Upphi}_0^{k,j})\right.\\ &\qquad\left. +\frac{1}{2}g_j^{\prime\prime} (s_{k,j})(\mathit{\Upphi}_0^{k+1,j}-\mathit{\Upphi}_0^{k,j})^2\right]\\ &\quad=\sum\limits_{j=1}^J \left[\vphantom{\frac{1}{2}} g_j^{\prime}(\mathit{\Upphi}_0^{k,j})({\bf u}^{k+1}\cdot (f^{k+1,j}\odot G^{k+1,j})\right.\\ &\qquad\left.-{\bf u}^{k}\cdot (f^{k,j}\odot G^{k,j}))+\frac{1}{2} g_j^{\prime\prime} (s_{k,j})(\mathit{\Upphi}_0^{k+1,j}-\mathit{\Upphi}_0^{k,j})^2\right]\\ &\quad=\sum\limits_{j=1}^J \left[g_j^{\prime} (\mathit{\Upphi}_0^{k,j})(\Updelta {\bf u}^{k}\cdot (\varphi^{k,j}\odot \psi^{k,j})+\Updelta{\bf u}^k\cdot(\varphi^{k,j}\odot G^{k,j})\right.\\ &\qquad+\Updelta {\bf u}^k\cdot(f^{k,j}\odot\psi^{k,j})+{\bf u}^k\cdot(\varphi^{k,j}\odot\psi^{k,j})\\ &\qquad+\Updelta{\bf u}^k\cdot(f^{k,j}\odot G^{k,j}) +{\bf u}^k\cdot(\varphi^{k,j}\odot G^{k,j})\\ &\qquad\left.+{\bf u}^k\cdot(f^{k,j}\odot \psi^{k,j}))\right]+ \frac{1}{2}\sum\limits_{j=1}^J g_j^{\prime\prime} (s_{k,j})(\mathit{\Upphi}_0^{k+1,j}-\mathit{\Upphi}_0^{k,j})^2\\ &\quad\leq C_2\eta^2 \left\|\frac{\partial E({\bf W}^k)}{\partial{\bf W}}\right\|^2\\ &\qquad+ C_3\eta^2\left(\left\|\frac{\partial E({\bf W}^k)}{\partial{\bf u}}\right\|^2 +\sum\limits_{i=1}^n\left\|\frac{\partial E({\bf W}^k)} {\partial{\bf v}_{\bf i}}\right\|^2\right)\\ &\qquad+C_4\eta^2 \left( \left\|\frac{\partial E({\bf W}^k)} {\partial{\bf u}}\right\|^2+\sum\limits_{i=1}^n \left\|\frac{\partial E({\bf W}^k)} {\partial{\bf a}_{\bf i}}\right\|^2\right.\\ &\qquad\left.+ \sum\limits_{i=1}^n \left\|\frac{\partial E({\bf W}^k)} {\partial{\bf b}_{\bf i}}\right\|^2\right)\\ &\qquad+C_5\eta^2\sum\limits_{i=1}^n\left( \left\|\frac{\partial E({\bf W}^k)}{\partial{\bf v}_{\bf i}}\right\|^2+ 
\left\|\frac{\partial E({\bf W}^k)}{\partial{\bf a}_{\bf i}}\right\|^2\right.\\ &\qquad\left.+\left\|\frac{\partial E({\bf W}^k)}{\partial{\bf b}_{\bf i}}\right\|^2\right)\\ &\qquad-\eta\left\|\frac{\partial E({\bf W}^k)}{\partial{\bf u}}\right\|^2-(\eta-C_6\eta^2)\sum\limits_{i=1}^n \left\|\frac{\partial E({\bf W}^k)}{\partial{\bf v}_{\bf i}}\right\|^2\\ &\qquad-(\eta-C_7\eta^2)\sum\limits_{i=1}^n\left( \left\|\frac{\partial E({\bf W}^k)}{\partial{\bf a}_{\bf i}}\right\|^2+ \left\|\frac{\partial E({\bf W}^k)} {\partial{\bf b}_{\bf i}}\right\|^2\right)\\ &\qquad+C_8\eta^2\left\|\frac{\partial E({\bf W}^k)} {\partial{\bf W}}\right\|^2\\ &\quad\leq C_2\eta^2\left\|\frac{\partial E({\bf W}^k)} {\partial{\bf W}}\right\|^2-\eta\left( \left\|\frac{\partial E({\bf W}^k)}{\partial{\bf u}}\right\|^2\right.\\ &\qquad\left.+\sum\limits_{i=1}^n\left( \left\|\frac{\partial E({\bf W}^k)} {\partial{\bf v}_{\bf i}}\right\|^2+ \left\|\frac{\partial E({\bf W}^k)} {\partial{\bf a}_{\bf i}}\right\|^2+ \left\|\frac{\partial E({\bf W}^k)}{\partial{\bf b}_{\bf i}}\right\|^2\right)\right)\\ &\qquad+(C_9+C_{10}+C_5)\eta^2\left( \left\|\frac{\partial E({\bf W}^k)}{\partial{\bf u}}\right\|^2\right.\\ &\qquad\left.+\sum\limits_{i=1}^n\left( \left\|\frac{\partial E({\bf W}^k)}{\partial{\bf v}_{\bf i}}\right\|^2+ \left\|\frac{\partial E({\bf W}^k)} {\partial{\bf a}_{\bf i}}\right\|^2+ \left\|\frac{\partial E({\bf W}^k)} {\partial{\bf b}_{\bf i}}\right\|^2\right)\right)\\ &\qquad+C_8\eta^2\left\|\frac{\partial E({\bf W}^k)}{\partial{\bf W}}\right\|^2\\ &\quad=-(\eta-C\eta^2)\left\|\frac{\partial E({\bf W}^k)}{\partial{\bf W}}\right\|^2, \end{aligned} $$
(32)
where \(C=C_2+C_5+C_8+C_9+C_{10}\) and \({s_{k,j}\in\mathbb{R}}\) lies on the segment between \(\mathit{\Upphi}_0^{k,j}\) and \(\mathit{\Upphi}_0^{k+1,j}\).
Write \(\beta=\eta-C\eta^2\). Then,
$$ E({\bf W}^{k+1})\leq E({\bf W}^{k})-\beta\left\|\frac{\partial E({\bf W}^k)}{\partial {\bf W}}\right\|^2. $$
(33)
Obviously, if the learning rate η is chosen such that
$$ 0< \eta < \frac{1}{C} $$
(34)
is satisfied, then the following holds
$$ E({\bf W}^{k+1})\leq E({\bf W}^{k}),\quad k=0,1,2,\ldots. $$
(35)
This proves statement (i). \(\square\)
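As a brief check of why condition (34) suffices, \(\beta\) can be factored as shown below. This is only a sketch that uses (33) and (34) and introduces no new assumptions.
$$ \beta=\eta-C\eta^2=\eta(1-C\eta)>0\quad \hbox{whenever}\quad 0<\eta<\frac{1}{C}, $$
so the term subtracted in (33) is nonnegative and (35) follows.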
Proof of Statement (ii)
From (33), it follows that
$$ \begin{aligned} &E({\bf W}^{k+1})\\ &\quad \leq E({\bf W}^{k})-\beta\left\| \frac{\partial E({\bf W}^k)}{\partial {\bf W}}\right\|^2\\ &\quad \leq{\cdots}\leq E({\bf W}^{0})-\beta\sum\limits_{t=0}^k \left\|\frac{\partial E({\bf W}^t)}{\partial {\bf W}}\right\|^2. \end{aligned} $$
Since \(E({\bf W}^{k+1})\geq 0\), the following holds:
$$ \beta\sum\limits_{t=0}^k \left\|\frac{\partial E({\bf W}^t)} {\partial {\bf W}}\right\|^2\leq E({\bf W}^{0}). $$
Letting \(k\rightarrow \infty\) results in
$$ \sum\limits_{t=0}^\infty\left\|\frac{\partial E({\bf W}^t)} {\partial {\bf W}}\right\|^2\leq \frac{1}{\beta} E({\bf W}^{0})< \infty. $$
This immediately gives
$$ \lim\limits_{k\rightarrow\infty}\left\|\frac{\partial E({\bf W}^k)}{\partial {\bf W}}\right\|=0. $$
(36)
Statement (ii) is therefore proved. \(\square\)
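The final step of this argument can be spelled out as follows. The sketch relies only on the standard fact that the terms of a convergent series of nonnegative numbers tend to zero, and on \(\beta>0\), which is guaranteed by (34) and is needed when dividing by \(\beta\).
$$ \sum\limits_{t=0}^\infty\left\|\frac{\partial E({\bf W}^t)}{\partial {\bf W}}\right\|^2<\infty \;\Longrightarrow\; \lim\limits_{k\rightarrow\infty}\left\|\frac{\partial E({\bf W}^k)}{\partial {\bf W}}\right\|^2=0 \;\Longrightarrow\; \lim\limits_{k\rightarrow\infty}\left\|\frac{\partial E({\bf W}^k)}{\partial {\bf W}}\right\|=0. $$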
Proof of Statement (iii)
It follows from (12) and (36) that
$$ \lim\limits_{k\rightarrow\infty}\|\Updelta {\bf W}^k\|=0. $$
(37)
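The step from (36) to (37) can be sketched as follows, under the assumption that (12) is the gradient update rule \(\Updelta {\bf W}^k={\bf W}^{k+1}-{\bf W}^k=-\eta\,\partial E({\bf W}^k)/\partial {\bf W}\), which is consistent with the way the increments are replaced by gradients in the proof of Lemma 2:
$$ \|\Updelta {\bf W}^k\|=\eta\left\|\frac{\partial E({\bf W}^k)}{\partial {\bf W}}\right\|\rightarrow 0\quad (k\rightarrow\infty). $$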
Note that the error function E(W) defined in (6) is continuous and differentiable. According to (37), Assumption (A3) and Lemma 1, it is straightforward to show that there exists a point \({\bf W}^{\ast}\in\mathit{\Upomega}\) such that
$$ \lim\limits_{k\rightarrow\infty}{\bf W}^k={\bf W}^{\ast}. $$
Thus, statement (iii) is proved, and this completes the proof of Theorem 1. \(\square\)
What remains is to prove Lemma 2. This is done by proving (23) to (31) successively in the sequel.
Proof of Lemma 2 (23)
For a fixed and finite set of training patterns, the estimates of (23) can be established by using Assumption (A1) in conjunction with (7), (8), (11), (21), (22) and also with the definitions of operator “\(\odot\)” and window function \(G(\cdot)\). \(\square\)
Proof of Lemma 2 (24)
By using the Mean Value Theorem, for 1 ≤ i ≤ n, 1 ≤ j ≤ J and \(k=0,1,2,\ldots,\) the following can be established:
$$ \begin{aligned} &h_i^{k+1,j}-h_i^{k,j}\\ &\quad={\exp}\left(\sum\limits_{l=1}^m (-(x_l^j-a_{li}^{k+1})^2(b_{li}^{k+1})^2)\right)\\ &\qquad-{\exp}\left(\sum\limits_{l=1}^m (-(x_l^j-a_{li}^k)^2(b_{li}^k)^2)\right)\\ &\quad={\exp}(-(\xi_i^{k+1,j}\odot {\bf b}_{\bf i}^{k+1})\cdot(\xi_i^{k+1,j}\odot {\bf b}_{\bf i}^{k+1}))\\ &\qquad-{\exp}(-(\xi_i^{k,j}\odot {\bf b}_{\bf i}^k)\cdot(\xi_i^{k,j}\odot {\bf b}_{\bf i}^k))\\ &\quad={\exp}(-\mathit{\Upphi}_i^{k+1,j}\cdot \mathit{\Upphi}_i^{k+1,j})-{\exp}(-\mathit{\Upphi}_i^{k,j}\cdot \mathit{\Upphi}_i^{k,j})\\ &\quad={\exp}(t_{i}^{s,j})(-(\mathit{\Upphi}_i^{k+1,j}\cdot \mathit{\Upphi}_i^{k+1,j}-\mathit{\Upphi}_i^{k,j}\cdot \mathit{\Upphi}_i^{k,j}))\\ &\quad=-{\exp}(t_{i}^{s,j})((\mathit{\Upphi}_i^{k+1,j}+ \mathit{\Upphi}_i^{k,j})\cdot(\mathit{\Upphi}_i^{k+1,j}- \mathit{\Upphi}_i^{k,j})), \end{aligned} $$
where \(t_i^{s,j}\) lies in between \(-\mathit{\Upphi}_i^{k,j}\cdot \mathit{\Upphi}_i^{k,j}\) and \(-\mathit{\Upphi}_i^{k+1,j}\cdot \mathit{\Upphi}_i^{k+1,j}\). By (23) and the Properties 1) and 3) of operator “\(\odot\)”, it follows that
$$ \begin{aligned} &|h_i^{k+1,j}-h_i^{k,j}|\\ &\quad\leq 2C_1\|\mathit{\Upphi}_i^{k+1,j}- \mathit{\Upphi}_i^{k,j}\|=2C_1\|\xi_i^{k+1,j}\odot {\bf b}_{\bf i}^{k+1}-\xi_i^{k,j}\odot {\bf b}_{\bf i}^k\|\\ &\quad=2C_1\|\xi_i^{k+1,j}\odot {\bf b}_{\bf i}^{k+1}-\xi_i^{k,j}\odot {\bf b}_{\bf i}^{k+1} + \xi_i^{k,j}\odot {\bf b}_{\bf i}^{k+1}\\ &\qquad-\xi_i^{k,j}\odot {\bf b}_{\bf i}^k\|\\ &\quad=2C_1\|(\xi_i^{k+1,j}-\xi_i^{k,j})\odot {\bf b}_{\bf i}^{k+1} + \xi_i^{k,j}\odot ({\bf b}_{\bf i}^{k+1}-{\bf b}_{\bf i}^k)\|\\ &\quad\leq 2C_1\|(\xi_i^{k+1,j}-\xi_i^{k,j})\odot {\bf b}_{\bf i}^{k+1}\| + 2C_1\|\xi_i^{k,j}\odot ({\bf b}_{\bf i}^{k+1}-{\bf b}_{\bf i}^k)\|\\ &\quad\leq 2C_0C_1\|\xi_i^{k+1,j}-\xi_i^{k,j}\|+2C_1^2\|{\bf b}_{\bf i}^{k+1}-{\bf b}_{\bf i}^k\|\\ &\quad\leq C_{21}(\|\Updelta {\bf a}_{\bf i}^k\|+\|\Updelta {\bf b}_{\bf i}^k\|), \end{aligned} $$
(38)
where \(C_{21}=2C_1\max\{C_0,C_1\}\). Furthermore, for 1 ≤ i ≤ n, 1 ≤ j ≤ J and \(k=0,1,2,\ldots,\) the following holds:
$$ \begin{aligned} \psi_i^{k,j}&=G_i^{k+1,j}-G_i^{k,j} \\ &=G^{\prime}(\sigma_{i}^{p,j})(h_i^{k+1,j}-h_i^{k,j}),\\ \end{aligned} $$
where \(\sigma_i^{p,j}\) lies in between \(h_i^{k+1,j}\) and \(h_i^{k,j}\). Then, by (38) and Assumption (A2),
$$ \begin{aligned} |\psi_i^{k,j}|&\leq M_1|h_i^{k+1,j}-h_i^{k,j}|\\ &\leq M_1C_{21}(\|\Updelta {\bf a}_{\bf i}^k\|+\|\Updelta {\bf b}_{\bf i}^k\|). \end{aligned} $$
(39)
It follows from (39) that for any 1 ≤ j ≤ J and \(k=0,1,2,\ldots,\)
$$ \begin{aligned} \|\psi^{k,j}\|^2 &=\|G^{k+1,j}-G^{k,j}\|^2=\left\| \left(\begin{array}{c} G_1^{k+1,j}-G_1^{k,j}\\ G_2^{k+1,j}-G_2^{k,j}\\ \ldots\\ G_n^{k+1,j}-G_n^{k,j}\end{array}\right)\right\|^2\\ &\leq M_1^2C_{21}^2\sum\limits_{i=1}^n(\|\Updelta {\bf a}_{\bf i}^k\|+\|\Updelta {\bf b}_{\bf i}^k\|)^2\\ &\leq 2M_1^2C_{21}^2\sum\limits_{i=1}^n(\|\Updelta {\bf a}_{\bf i}^k\|^2+\|\Updelta {\bf b}_{\bf i}^k\|^2). \end{aligned} $$
(40)
By Assumption (A1), it can be derived that
$$ \|\psi^{k,j}\|\leq4\sqrt{n}M_1C_0C_{21}. $$
(41)
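The constant in (41) can be traced back to (40) as follows. The sketch assumes that Assumption (A1), which is not restated in this Appendix, bounds the weight vectors by \(C_0\), so that \(\|\Updelta {\bf a}_{\bf i}^k\|\leq\|{\bf a}_{\bf i}^{k+1}\|+\|{\bf a}_{\bf i}^k\|\leq 2C_0\), and likewise for \({\bf b}_{\bf i}\):
$$ \|\psi^{k,j}\|^2\leq 2M_1^2C_{21}^2\sum\limits_{i=1}^n\left((2C_0)^2+(2C_0)^2\right)=16nM_1^2C_0^2C_{21}^2. $$
Taking square roots gives the bound \(4\sqrt{n}M_1C_0C_{21}\).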
As for \(\varphi^{k,j}\),
$$ \begin{aligned} \varphi_i^{k,j}&=f_i^{k+1,j}-f_i^{k,j} \\ &=f({\bf v}_{\bf i}^{k+1}\cdot{\bf x}^j)-f({\bf v}_{\bf i}^{k}\cdot{\bf x}^j) \\ &=f^{\prime}(\tau_{i}^{r,j})({\bf v}_{\bf i}^{k+1}-{\bf v}_{\bf i}^{k})\cdot{\bf x}^j,\\ \end{aligned} $$
where \(\tau_i^{r,j}\) lies in between \({\bf v}_{\bf i}^{k+1}\cdot{\bf x}^j\) and \({\bf v}_{\bf i}^{k}\cdot{\bf x}^j\). Because \(|\frac{df(x)}{dx}|<1\), it follows that
$$ \begin{aligned} |\varphi_i^{k,j}|&=|f_i^{k+1,j}-f_i^{k,j}| \\ &\leq M\|\Updelta {\bf v}_{\bf i}^{k}\|, \end{aligned} $$
(42)
furthermore,
$$ \begin{aligned} \|\varphi^{k,j}\|^2&=\|f^{k+1,j}-f^{k,j}\|^2 \\ &\leq M^2\sum\limits_{i=1}^n\|\Updelta {\bf v}_{\bf i}^{k}\|^2. \end{aligned} $$
(43)
By Assumption (A1), the following holds:
$$ \|\varphi^{k,j}\|\leq2\sqrt{n}MC_0. $$
(44)
According to the definition of \(g_j(t)\) as expressed in (9), it is straightforward to establish that \(g_j^{\prime}(t)=t-O^j\). This together with (23) leads to \(|g_j^{\prime}(\mathit{\Upphi}_0^{k,j})|\leq (nC_0+C_1)\). Employing (41) and (44), it can be derived that
$$ \begin{aligned} &\sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j})(\Updelta {\bf u}^{k}\cdot (\varphi^{k,j}\odot \psi^{k,j}))\\ &\quad=\frac{1}{3}\sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j}) (\Updelta {\bf u}^{k}\cdot (\varphi^{k,j}\odot \psi^{k,j}))\\ &\qquad +\frac{1}{3}\sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j}) (\Updelta {\bf u}^{k}\cdot (\varphi^{k,j}\odot \psi^{k,j}))\\ &\qquad+\frac{1}{3}\sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j}) (\Updelta {\bf u}^{k}\cdot (\varphi^{k,j}\odot \psi^{k,j}))\\ &\quad \leq \frac{1}{3}(nC_0+C_1)\sum\limits_{j=1}^J \|\Updelta {\bf u}^{k}\|\|\varphi^{k,j}\odot \psi^{k,j}\| \\ &\qquad+\frac{1}{3}(nC_0+C_1)\sum\limits_{j=1}^J \|\Updelta {\bf u}^{k}\|\|\varphi^{k,j}\odot \psi^{k,j}\| \\ &\qquad+\frac{1}{3}(nC_0+C_1)\sum\limits_{j=1}^J \|\Updelta {\bf u}^{k}\|\|\varphi^{k,j}\odot \psi^{k,j}\| \\ &\quad\leq \frac{1}{3}(nC_0+C_1)\sum\limits_{j=1}^J \|\Updelta {\bf u}^{k}\|\|\varphi^{k,j}\|\| \psi^{k,j}\| \\ &\qquad+\frac{1}{3}(nC_0+C_1)\sum\limits_{j=1}^J \|\Updelta {\bf u}^{k}\|\|\varphi^{k,j}\|\| \psi^{k,j}\| \\ &\qquad+\frac{1}{3}(nC_0+C_1)\sum\limits_{j=1}^J \|\Updelta {\bf u}^{k}\|\|\varphi^{k,j}\|\| \psi^{k,j}\| \\ &\quad\leq \frac{4}{3}M_1C_0C_{21}{\sqrt{n}} (nC_0+C_1)\sum\limits_{j=1}^J \|\Updelta {\bf u}^{k}\|\|\varphi^{k,j}\| \\ &\qquad+\frac{2}{3}MC_0{\sqrt{n}}(nC_0+C_1)\sum\limits_{j=1}^J \|\Updelta {\bf u}^{k}\|\| \psi^{k,j}\| \\ &\qquad+\frac{2}{3}C_0(nC_0+C_1)\sum\limits_{j=1}^J \|\varphi^{k,j}\|\| \psi^{k,j}\| \\ &\quad\leq \frac{2}{3} M_1C_0C_{21}{\sqrt{n}} (nC_0+C_1)\sum\limits_{j=1}^J(\|\Updelta {\bf u}^{k}\|^2+\|\varphi^{k,j}\|^2) \\ &\qquad+\frac{1}{3} MC_0{\sqrt{n}}(nC_0+C_1)\sum\limits_{j=1}^J(\|\Updelta {\bf u}^{k}\|^2+\|\psi^{k,j}\|^2) \\ &\qquad+\frac{1}{3}C_0(nC_0+C_1)\sum\limits_{j=1}^J (\|\varphi^{k,j}\|^2+\|\psi^{k,j}\|^2)\\ &\quad\leq \frac{2}{3}JM_1C_0C_{21}{\sqrt{n}}(nC_0+C_1)\left(\|\Updelta {\bf u}^{k}\|^2+M^2\sum\limits_{i=1}^n\|\Updelta{\bf v}_{\bf 
i}^{k}\|^2\right)\\ &\qquad+\frac{1}{3}JMC_0{\sqrt{n}}(nC_0+C_1)(\|\Updelta {\bf u}^{k}\|^2\\ &\qquad+2M_1^2C_{21}^2\sum\limits_{i=1}^n(\|\Updelta {\bf a}_{\bf i}^k\|^2+\|\Updelta {\bf b}_{\bf i}^k\|^2))\\ &\qquad+\frac{1}{3}JC_0(nC_0+C_1)\left(M^2\sum\limits_{i=1}^n\|\Updelta{\bf v}_{\bf i}^{k}\|^2\right.\\ &\qquad\left.+2M_1^2C_{21}^2\sum\limits_{i=1}^n\left(\|\Updelta {\bf a}_{\bf i}^k\|^2+\|\Updelta {\bf b}_{\bf i}^k\|^2\right)\right)\\ &\quad\leq C_2\left(\|\Updelta {\bf u}^k\|^2+\sum\limits_{i=1}^n\|\Updelta{\bf v}_{\bf i}^{k}\|^2+\sum\limits_{i=1}^n\|\Updelta {\bf a}_{\bf i}^k\|^2+\sum\limits_{i=1}^n\|\Updelta {\bf b}_{\bf i}^k\|^2\right)\\ &\quad =C_2\eta^2\left( \left\|\frac{\partial E({\bf W}^k)} {\partial{\bf u}}\right\|^2+ \sum\limits_{i=1}^n \left\|\frac{\partial E({\bf W}^k)}{\partial{\bf v}_{\bf i}}\right\|^2\right.\\ &\qquad\left.+\sum\limits_{i=1}^n \left\|\frac{\partial E({\bf W}^k)} {\partial{\bf a}_{\bf i}}\right\|^2 +\sum\limits_{i=1}^n\left\| \frac{\partial E({\bf W}^k)}{\partial{\bf b}_{\bf i}}\right\|^2\right) \\ &\quad =C_2\eta^2\|\frac{\partial E({\bf W}^k)}{\partial{\bf W}}\|^2\\ \end{aligned} $$
where \(C_2=\frac{1}{3}JC_0(nC_0+C_1)\hbox{max}\{\sqrt{n}(2M_1C_{21}+M)\), \(M^2(2\sqrt{n}M_1C_{21}+1),2M_1^2C_{21}^2(\sqrt{n}M+1)\}\).
So Lemma 2 (24) is proved. \(\square\)
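Two elementary inequalities carry most of the chain of estimates above. They are sketched here under the assumption that “\(\odot\)” is the componentwise product, as suggested by the expression for \(h_i^{k,j}\) in the proof. The first is presumably among the listed properties of “\(\odot\)”. For \({\bf x},{\bf y}\in\mathbb{R}^n\) and real numbers \(a,b\):
$$ \|{\bf x}\odot {\bf y}\|^2=\sum\limits_{i=1}^n x_i^2y_i^2\leq\left(\sum\limits_{i=1}^n x_i^2\right)\left(\sum\limits_{l=1}^n y_l^2\right)=\|{\bf x}\|^2\|{\bf y}\|^2,\qquad ab\leq\frac{1}{2}(a^2+b^2). $$
The first inequality holds because every cross term \(x_i^2y_l^2\) is nonnegative. The second follows from \((a-b)^2\geq 0\).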
Proof of Lemma 2 (25)
Using the definition of window function \(G(\cdot)\), it can be established that
$$ \begin{aligned} &\sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j})(\Updelta{\bf u}^k\cdot(\varphi^{k,j}\odot G^{k,j}))\\ &\quad \leq(nC_0+C_1)\sum\limits_{j=1}^J\|\Updelta{\bf u}^k\|\|\varphi^{k,j}\|\|G^{k,j}\|\\ &\quad \leq{\sqrt{n}}(nC_0+C_1)\sum\limits_{j=1}^J\|\Updelta{\bf u}^k\|\|\varphi^{k,j}\|\\ &\quad \leq \frac{1}{2} {\sqrt{n}}(nC_0+C_1)\sum\limits_{j=1}^J(\|\Updelta{\bf u}^k\|^2+\|\varphi^{k,j}\|^2)\\ &\quad \leq \frac{1}{2}J{\sqrt{n}}(nC_0+C_1)(\|\Updelta {\bf u}^{k}\|^2+M^2\sum\limits_{i=1}^n\|\Updelta{\bf v}_{\bf i}^{k}\|^2)\\ &\quad \leq C_3\eta^2\left(\left\|\frac{\partial E({\bf W}^k)} {\partial{\bf u}}\right\|^2+\sum\limits_{i=1}^n\| \frac{\partial E({\bf W}^k)}{\partial{\bf v}_{\bf i}}\|^2\right),\\ \end{aligned} $$
where \(C_3=\frac{1}{2}J\sqrt{n}(nC_0+C_1)\hbox{max}\{1,M^2\}\). So Lemma 2 (25) is proved. \(\square\)
Proof of Lemma 2 (26)
With the definition of activation function f(x), the following can be derived:
$$ \begin{aligned} &\sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j})(\Updelta {\bf u}^k\cdot(f^{k,j}\odot\psi^{k,j})) \\ &\quad\leq(nC_0+C_1)\sum\limits_{j=1}^J\|\Updelta{\bf u}^k\|\|f^{k,j}\|\|\psi^{k,j}\| \\ &\quad\leq{\sqrt{n}}(nC_0+C_1)\sum\limits_{j=1}^J\|\Updelta{\bf u}^k\|\|\psi^{k,j}\| \\ &\quad\leq \frac{1}{2}{\sqrt{n}}(nC_0+C_1)\sum\limits_{j=1}^J(\|\Updelta{\bf u}^k\|^2+\|\psi^{k,j}\|^2)\\ &\quad\leq \frac{1}{2} J{\sqrt{n}}(nC_0+C_1)(\|\Updelta {\bf u}^{k}\|^2\\ &\qquad+2M_1^2C_{21}^2\sum\limits_{i=1}^n(\|\Updelta{\bf a}_{\bf i}^{k}\|^2+\|\Updelta{\bf b}_{\bf i}^{k}\|^2))\\ &\quad\leq C_4\eta^2\left(\left\|\frac{\partial E({\bf W}^k)}{\partial{\bf u}}\right\|^2+\sum\limits_{i=1}^n \left\| \frac{\partial E({\bf W}^k)}{\partial{\bf a}_{\bf i}}\right\|^2\right.\\ &\qquad\left.+\sum\limits_{i=1}^n \left\|\frac{\partial E({\bf W}^k)} {\partial{\bf b}_{\bf i}}\right\|^2\right),\\ \end{aligned} $$
where \(C_4=\frac{1}{2}J\sqrt{n}(nC_0+C_1) \hbox{max}\{1,2M_1^2C_{21}^2\}\). So Lemma 2 (26) is proved. \(\square\)
Proof of Lemma 2 (27)
By Assumption (A1)
$$ \begin{aligned} &\sum\limits_{j=1}^J g_j^{\prime} (\mathit{\Upphi}_0^{k,j})({\bf u}^k\cdot(\varphi^{k,j}\odot\psi^{k,j})) \\ &\quad\leq(nC_0+C_1)\sum\limits_{j=1}^J\|{\bf u}^k\|\|\varphi^{k,j}\|\|\psi^{k,j}\| \\ &\quad\leq(nC_0+C_1)C_0\sum\limits_{j=1}^J \|\varphi^{k,j}\|\|\psi^{k,j}\|\\ &\quad\leq \frac{1}{2}(nC_0+C_1)C_0\sum\limits_{j=1}^J (\|\varphi^{k,j}\|^2+\|\psi^{k,j}\|^2)\\ &\quad\leq\frac{1}{2} J(nC_0+C_1)C_0 (M^2\sum\limits_{i=1}^n\|\Updelta {\bf v}_{\bf i}^{k}\|^2\\ &\qquad+2M_1^2C_{21}^2\sum\limits_{i=1}^n (\|\Updelta{\bf a}_{\bf i}^{k}\|^2+\|\Updelta{\bf b}_{\bf i}^{k}\|^2))\\ &\quad\leq C_5\eta^2\sum\limits_{i=1}^n\left(\left\|\frac{\partial E({\bf W}^k)}{\partial{\bf v}_{\bf i}}\right\|^2+\left\|\frac{\partial E({\bf W}^k)}{\partial{\bf a}_{\bf i}}\right\|^2+\left\|\frac{\partial E({\bf W}^k)}{\partial{\bf b}_{\bf i}}\right\|^2\right)\\ \end{aligned} $$
where \(C_5=\frac{1}{2}JC_0(nC_0+C_1) \hbox{max}\{M^2,2M_1^2C_{21}^2\}\). So Lemma 2 (27) is proved. \(\square\)
Proof of Lemma 2 (28)
It follows from (14) that
$$ \begin{aligned} &\sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j})(\Updelta{\bf u}^k\cdot(f^{k,j}\odot G^{k,j}))\\ &\quad =\frac{\partial E({\bf W}^k)}{\partial{\bf u}}\cdot \left(-\eta \frac{\partial E({\bf W}^k)}{\partial{\bf u}}\right)\\ &\quad=-\eta\left\|\frac{\partial E({\bf W}^k)}{\partial{\bf u}}\right\|^2.\\ \end{aligned} $$
So Lemma 2 (28) is proved. \(\square\)
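The first equality in this proof can be sketched via the chain rule. It is assumed here, as in the first line of (32), that \(E({\bf W}^k)=\sum_{j=1}^J g_j(\mathit{\Upphi}_0^{k,j})\) with \(\mathit{\Upphi}_0^{k,j}={\bf u}^k\cdot (f^{k,j}\odot G^{k,j})\), and that (14) is the update \(\Updelta{\bf u}^k=-\eta\,\partial E({\bf W}^k)/\partial{\bf u}\):
$$ \frac{\partial E({\bf W}^k)}{\partial{\bf u}}=\sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j})(f^{k,j}\odot G^{k,j}),\qquad \sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j})(\Updelta{\bf u}^k\cdot(f^{k,j}\odot G^{k,j}))=\frac{\partial E({\bf W}^k)}{\partial{\bf u}}\cdot\Updelta{\bf u}^k. $$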
Proof of Lemma 2 (29)
Using the Taylor expansion, the following can be established:
$$ \begin{aligned} &{\bf u}^k\cdot(\varphi^{k,j}\odot G^{k,j})\\ &\quad =\sum\limits_{i=1}^nu_i^k(f_i^{k+1,j}-f_i^{k,j})G_i^{k,j}\\ &\quad =\sum\limits_{i=1}^nu_i^k\left(f^{\prime}({\bf v}_{\bf i}^{k}\cdot{\bf x}^j)(\Updelta {\bf v}_{\bf i}^{k}\cdot{\bf x}^j)+\frac{1}{2} f^{\prime\prime}(\widetilde{\tau}_i^{r,j})(\Updelta {\bf v}_{\bf i}^{k}\cdot{\bf x}^j)^2\right)G_i^{k,j}\\ &\quad =\sum\limits_{i=1}^nu_i^kf^{\prime}({\bf v}_{\bf i}^{k}\cdot{\bf x}^j)(\Updelta {\bf v}_{\bf i}^{k}\cdot{\bf x}^j)G_i^{k,j}\\ &\qquad+\frac{1}{2}\sum\limits_{i=1}^nu_i^kf^{\prime\prime} (\widetilde{\tau}_i^{r,j})(\Updelta {\bf v}_{\bf i}^{k}\cdot{\bf x}^j)^2G_i^{k,j}\\ &\quad \triangleq \,\Upgamma_1+\Upgamma_2, \end{aligned} $$
(45)
where \(\widetilde{\tau}_i^{r,j}\) lies in between \({\bf v}_{\bf i}^{k+1}\cdot{\bf x}^j\) and \({\bf v}_{\bf i}^{k}\cdot{\bf x}^j\).
From (15), the following can be derived:
$$ \begin{aligned} &\sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j})\Upgamma_1\\ &\quad =\sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j})\sum\limits_{i=1}^nu_i^kf^{\prime} ({\bf v}_{\bf i}^{k}\cdot{\bf x}^j)(\Updelta {\bf v}_{\bf i}^{k}\cdot{\bf x}^j)G_i^{k,j}\\ &\quad= \sum\limits_{i=1}^n \frac{\partial E({\bf W}^k)}{\partial{\bf v}_{\bf i}} \cdot \left(-\eta \frac{\partial E({\bf W}^k)}{\partial{\bf v}_{\bf i}}\right)\\ &\quad =-\eta\sum\limits_{i=1}^n \left\|\frac{\partial E({\bf W}^k)}{\partial{\bf v}_{\bf i}}\right\|^2.\\ \end{aligned} $$
(46)
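The second equality in (46) can be sketched via the chain rule. It is assumed that \(f_i^{k,j}=f({\bf v}_{\bf i}^{k}\cdot{\bf x}^j)\), that \(G_i^{k,j}\) does not depend on \({\bf v}_{\bf i}\), and that (15) is the update \(\Updelta{\bf v}_{\bf i}^k=-\eta\,\partial E({\bf W}^k)/\partial{\bf v}_{\bf i}\). Under these assumptions the gradient presumably takes the form
$$ \frac{\partial E({\bf W}^k)}{\partial{\bf v}_{\bf i}}=\sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j})\,u_i^kf^{\prime}({\bf v}_{\bf i}^{k}\cdot{\bf x}^j)\,G_i^{k,j}\,{\bf x}^j, $$
so that taking the inner product with \(\Updelta{\bf v}_{\bf i}^k\) and summing over \(i\) reproduces the double sum in (46).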
Observing \(|\frac{d^2f(x)}{dx^2}|<1\) and the definition of \(G(\cdot)\), it follows that
$$ \begin{aligned} &\sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j})\Upgamma_2\\ &\quad=\frac{1}{2}\sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j}) \sum\limits_{i=1}^nu_i^kf^{\prime\prime} (\widetilde{\tau}_i^{r,j})(\Updelta {\bf v}_{\bf i}^{k}\cdot{\bf x}^j)^2G_i^{k,j}\\ &\quad\leq \frac{1}{2} M^2C_0(nC_0+C_1) \sum\limits_{j=1}^J\sum\limits_{i=1}^n\|\Updelta {\bf v}_{\bf i}^{k}\|^2\\ &\quad \leq C_{6}\eta^2\sum\limits_{i=1}^n \left\|\frac{\partial E({\bf W}^k)}{\partial{\bf v}_{\bf i}}\right\|^2, \end{aligned} $$
(47)
where \(C_{6}=\frac{1}{2}JM^2C_0(nC_0+C_1)\). According to (46) and (47), it can be derived that
$$ \begin{aligned} &\sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j}){\bf u}^k\cdot(\varphi^{k,j}\odot G^{k,j}) \\ &\quad=\sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j})\Upgamma_1+\sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j})\Upgamma_2 \\ &\quad\leq-\eta\sum\limits_{i=1}^n \left\|\frac{\partial E({\bf W}^k)}{\partial{\bf v}_{\bf i}}\right\|^2+ C_{6}\eta^2\sum\limits_{i=1}^n \left\| \frac{\partial E({\bf W}^k)}{\partial{\bf v}_{\bf i}}\right\|^2 \\ &\quad =-(\eta-C_{6}\eta^2)\sum\limits_{i=1}^n\left\| \frac{\partial E({\bf W}^k)}{\partial{\bf v}_{\bf i}}\right\|^2.\\ \end{aligned} $$
So Lemma 2 (29) is proved. \(\square\)
Proof of Lemma 2 (30)
Using the Taylor expansion, it follows that
$$ \begin{aligned} &{\bf u}^k\cdot(f^{k,j}\odot \psi^{k,j})\\ &\quad=\sum\limits_{i=1}^nu_i^kf_i^{k,j}(G_i^{k+1,j}-G_i^{k,j})\\ &\quad=\sum\limits_{i=1}^nu_i^kf_i^{k,j}(G^{\prime} (h_i^{k,j})(h_i^{k+1,j}-h_i^{k,j})\\ &\qquad+\frac{1}{2}G^{\prime\prime} (\widetilde{\sigma}_{i}^{p,j})(h_i^{k+1,j}-h_i^{k,j})^2)\\ &\quad=\sum\limits_{i=1}^nu_i^kf_i^{k,j}G^{\prime} (h_i^{k,j})(h_i^{k+1,j}-h_i^{k,j})\\ &\qquad+\frac{1}{2}\sum\limits_{i=1}^nu_i^kf_i^{k,j}G^{\prime\prime} (\widetilde{\sigma}_{i}^{p,j})(h_i^{k+1,j}-h_i^{k,j})^2\\ &\quad\triangleq\, \mathit{\Upomega}_1+\mathit{\Upomega}_2, \end{aligned} $$
(48)
where \(\widetilde{\sigma}_{i}^{p,j}\) lies in between \(h_i^{k+1,j}\) and \(h_i^{k,j}\). Thus,
$$ \begin{aligned} &\mathit{\Upomega}_1\\ &\quad=\sum\limits_{i=1}^nu_i^kf_i^{k,j}G^{\prime} (h_i^{k,j})(h_i^{k+1,j}-h_i^{k,j})\\ &\quad=\sum\limits_{i=1}^nu_i^kf_i^{k,j}G^{\prime} (h_i^{k,j})({\exp}(-\mathit{\Upphi}_i^{k+1,j}\cdot\mathit{\Upphi}_i^{k+1,j})\\ &\qquad-{\exp}(-\mathit{\Upphi}_i^{k,j}\cdot\mathit{\Upphi}_i^{k,j}))\\ &\quad=\sum\limits_{i=1}^nu_i^kf_i^{k,j}G^{\prime} (h_i^{k,j}) \left[\vphantom{\frac{1}{2}} {\exp}(-\mathit{\Upphi}_i^{k,j}\cdot\mathit{\Upphi}_i^{k,j})\right.\\ &\qquad\cdot (-(\mathit{\Upphi}_i^{k+1,j}\cdot\mathit{\Upphi}_i^{k+1,j} -\mathit{\Upphi}_i^{k,j}\cdot\mathit{\Upphi}_i^{k,j}))\\ &\qquad\left.+\frac{1}{2}{\exp}(\widetilde{t}_i^{s,j}) (-(\mathit{\Upphi}_i^{k+1,j}\cdot\mathit{\Upphi}_i^{k+1,j} -\mathit{\Upphi}_i^{k,j}\cdot\mathit{\Upphi}_i^{k,j}))^2\right]\\ &\quad=\sum\limits_{i=1}^nu_i^kf_i^{k,j}G^{\prime}(h_i^{k,j}) h_i^{k,j}(-(\mathit{\Upphi}_i^{k+1,j}\cdot\mathit{\Upphi}_i^{k+1,j}- \mathit{\Upphi}_i^{k,j}\cdot\mathit{\Upphi}_i^{k,j}))\\ &\qquad+\frac{1}{2}\sum\limits_{i=1}^nu_i^kf_i^{k,j} G^{\prime}(h_i^{k,j}){\exp}(\widetilde{t}_i^{s,j})\\ &\qquad\cdot(\mathit{\Upphi}_i^{k+1,j}\cdot\mathit{\Upphi}_i^{k+1,j} -\mathit{\Upphi}_i^{k,j}\cdot\mathit{\Upphi}_i^{k,j})^2\\ &\quad\triangleq \Updelta_1+\Updelta_2,\\ \end{aligned} $$
(49)
where \(\widetilde{t}_i^{s,j}\) lies in between \(-\mathit{\Upphi}_i^{k+1,j}\cdot \mathit{\Upphi}_i^{k+1,j}\) and \(-\mathit{\Upphi}_i^{k,j}\cdot \mathit{\Upphi}_i^{k,j}\). It follows from the properties of the operator “\(\odot\)” that
$$ \begin{aligned} &\Updelta_1\\ &\quad=\sum\limits_{i=1}^nu_i^kf_i^{k,j}G^{\prime} (h_i^{k,j}) h_i^{k,j}(-(\mathit{\Upphi}_i^{k+1,j}\cdot\mathit{\Upphi}_i^{k+1,j}- \mathit{\Upphi}_i^{k,j}\cdot\mathit{\Upphi}_i^{k,j}))\\ &\quad=\sum\limits_{i=1}^nu_i^kf_i^{k,j}G^{\prime} (h_i^{k,j})h_i^{k,j} [-(2\mathit{\Upphi}_i^{k,j}\cdot(\mathit{\Upphi}_i^{k+1,j}-\mathit{\Upphi}_i^{k,j})\\ &\qquad+(\mathit{\Upphi}_i^{k+1,j}-\mathit{\Upphi}_i^{k,j}) \cdot(\mathit{\Upphi}_i^{k+1,j}-\mathit{\Upphi}_i^{k,j}))]\\ &\quad=-2\sum\limits_{i=1}^nu_i^kf_i^{k,j}G^{\prime}(h_i^{k,j})h_i^{k,j} (\mathit{\Upphi}_i^{k,j}\cdot(\mathit{\Upphi}_i^{k+1,j}-\mathit{\Upphi}_i^{k,j}))\\ &\qquad-\sum\limits_{i=1}^nu_i^kf_i^{k,j}G^{\prime}(h_i^{k,j})h_i^{k,j} \|\mathit{\Upphi}_i^{k+1,j}-\mathit{\Upphi}_i^{k,j}\|^2\\ &\quad=-2\sum\limits_{i=1}^nu_i^kf_i^{k,j}G^{\prime}(h_i^{k,j})h_i^{k,j} (\mathit{\Upphi}_i^{k,j}\cdot((\xi_i^{k+1,j}-\xi_i^{k,j})\odot {\bf b}_{\bf i}^{k+1}\\ &\qquad+ \xi_i^{k,j}\odot ({\bf b}_{\bf i}^{k+1}-{\bf b}_{\bf i}^k)))-\delta\\ &\quad=-2\sum\limits_{i=1}^nu_i^kf_i^{k,j}G^{\prime}(h_i^{k,j})h_i^{k,j} ((\xi_i^{k,j}\odot {\bf b}_{\bf i}^k)\cdot((-\Updelta {\bf a}_{\bf i}^k)\odot {\bf b}_{\bf i}^{k+1}\\ &\qquad+\xi_i^{k,j}\odot \Updelta {\bf b}_{\bf i}^k))-\delta\\ &\quad=2\sum\limits_{i=1}^nu_i^kf_i^{k,j}G^{\prime}(h_i^{k,j})h_i^{k,j} (\xi_i^{k,j}\odot {\bf b}_{\bf i}^k)\cdot(\Updelta {\bf a}_{\bf i}^k\odot {\bf b}_{\bf i}^{k+1})\\ &\qquad-2\sum\limits_{i=1}^nu_i^kf_i^{k,j}G^{\prime} (h_i^{k,j})h_i^{k,j}(\xi_i^{k,j}\odot {\bf b}_{\bf i}^k)\cdot (\xi_i^{k,j}\odot \Updelta {\bf b}_{\bf i}^k)-\delta\\ &\quad=2\sum\limits_{i=1}^nu_i^kf_i^{k,j}G^{\prime}(h_i^{k,j})h_i^{k,j} (\xi_i^{k,j}\odot {\bf b}_{\bf i}^k\odot {\bf b}_{\bf i}^{k+1})\cdot\Updelta {\bf a}_{\bf i}^k\\ &\qquad-2\sum\limits_{i=1}^nu_i^kf_i^{k,j} G^{\prime}(h_i^{k,j})h_i^{k,j}(\xi_i^{k,j}\odot\xi_i^{k,j} \odot {\bf b}_{\bf i}^k)\cdot\Updelta {\bf b}_{\bf i}^k-\delta\\ 
&\quad=2\sum\limits_{i=1}^nu_i^kf_i^{k,j}G^{\prime}(h_i^{k,j})h_i^{k,j} (\xi_i^{k,j}\odot {\bf b}_{\bf i}^k\odot {\bf b}_{\bf i}^{k})\cdot\Updelta {\bf a}_{\bf i}^k\\ &\qquad+2\sum\limits_{i=1}^nu_i^kf_i^{k,j}G^{\prime}(h_i^{k,j})h_i^{k,j} (\xi_i^{k,j}\odot {\bf b}_{\bf i}^k\odot \Updelta {\bf b}_{\bf i}^k)\cdot\Updelta {\bf a}_{\bf i}^k\\ &\qquad-2\sum\limits_{i=1}^nu_i^kf_i^{k,j} G^{\prime}(h_i^{k,j})h_i^{k,j}(\xi_i^{k,j}\odot\xi_i^{k,j} \odot {\bf b}_{\bf i}^k)\cdot\Updelta {\bf b}_{\bf i}^k-\delta,\\ \end{aligned} $$
(50)
where \(\delta=\sum\nolimits_{i=1}^nu_i^kf_i^{k,j} G^{\prime}(h_i^{k,j})h_i^{k,j} \|\mathit{\Upphi}_i^{k+1,j}-\mathit{\Upphi}_i^{k,j}\|^2\). Then, according to (18) and (19),
$$ \begin{aligned} &\sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j})\Updelta_1\\ &\quad=2\sum\limits_{j=1}^Jg_j^{\prime}(\mathit{\Upphi}_0^{k,j}) \sum\limits_{i=1}^nu_i^kf_i^{k,j}G^{\prime}(h_i^{k,j})h_i^{k,j} (\xi_i^{k,j}\odot {\bf b}_{\bf i}^k\odot {\bf b}_{\bf i}^{k})\cdot\Updelta {\bf a}_{\bf i}^k\\ &\qquad+2\sum\limits_{j=1}^Jg_j^{\prime}(\mathit{\Upphi}_0^{k,j}) \sum\limits_{i=1}^nu_i^kf_i^{k,j}G^{\prime}(h_i^{k,j})h_i^{k,j} (\xi_i^{k,j}\odot {\bf b}_{\bf i}^k\odot \Updelta {\bf b}_{\bf i}^k)\cdot\Updelta {\bf a}_{\bf i}^k\\ &\qquad-2\sum\limits_{j=1}^Jg_j^{\prime}(\mathit{\Upphi}_0^{k,j}) \sum\limits_{i=1}^nu_i^kf_i^{k,j}G^{\prime} (h_i^{k,j})h_i^{k,j}(\xi_i^{k,j}\odot\xi_i^{k,j} \odot {\bf b}_{\bf i}^k)\cdot\Updelta {\bf b}_{\bf i}^k\\ &\qquad-\sum\limits_{j=1}^Jg_j^{\prime}(\mathit{\Upphi}_0^{k,j})\delta \\ &\quad=\sum\limits_{i=1}^n \frac{\partial E({\bf W}^k)}{\partial{\bf a}_{\bf i}}\cdot\Updelta {\bf a}_{\bf i}^k+2\sum\limits_{j=1}^Jg_j^{\prime}(\mathit{\Upphi}_0^{k,j}) \sum\limits_{i=1}^nu_i^kf_i^{k,j}G^{\prime}(h_i^{k,j})\\ &\qquad\cdot h_i^{k,j}(\xi_i^{k,j}\odot {\bf b}_{\bf i}^k\odot \Updelta {\bf b}_{\bf i}^k)\cdot\Updelta {\bf a}_{\bf i}^k\\ &\qquad +\sum\limits_{i=1}^n \frac{\partial E({\bf W}^k)}{\partial{\bf b}_{\bf i}}\cdot\Updelta {\bf b}_{\bf i}^k-\sum\limits_{j=1}^Jg_j^{\prime}(\mathit{\Upphi}_0^{k,j})\delta.\\ \end{aligned} $$
(51)
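The identification of the gradient terms in (51) can be sketched as follows. It is assumed that \(G_i^{k,j}=G(h_i^{k,j})\) with \(h_i^{k,j}={\exp}(-\mathit{\Upphi}_i^{k,j}\cdot\mathit{\Upphi}_i^{k,j})\) and \(\mathit{\Upphi}_i^{k,j}=\xi_i^{k,j}\odot{\bf b}_{\bf i}^k\) as in (22). The resulting expressions are presumably what (18) and (19) state. Differentiating \(h_i^{k,j}\) componentwise gives
$$ \frac{\partial h_i^{k,j}}{\partial{\bf a}_{\bf i}}=2h_i^{k,j}(\xi_i^{k,j}\odot{\bf b}_{\bf i}^k\odot{\bf b}_{\bf i}^k),\qquad \frac{\partial h_i^{k,j}}{\partial{\bf b}_{\bf i}}=-2h_i^{k,j}(\xi_i^{k,j}\odot\xi_i^{k,j}\odot{\bf b}_{\bf i}^k), $$
and hence, by the chain rule,
$$ \begin{aligned} \frac{\partial E({\bf W}^k)}{\partial{\bf a}_{\bf i}}&=2\sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j})u_i^kf_i^{k,j}G^{\prime}(h_i^{k,j})h_i^{k,j}(\xi_i^{k,j}\odot{\bf b}_{\bf i}^k\odot{\bf b}_{\bf i}^k),\\ \frac{\partial E({\bf W}^k)}{\partial{\bf b}_{\bf i}}&=-2\sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j})u_i^kf_i^{k,j}G^{\prime}(h_i^{k,j})h_i^{k,j}(\xi_i^{k,j}\odot\xi_i^{k,j}\odot{\bf b}_{\bf i}^k). \end{aligned} $$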
With Assumptions (A1) and (A2), and also (23) plus Property 1) of “\(\odot\)”, the following can be established:
$$ \begin{aligned} &2\sum\limits_{j=1}^Jg_j^{\prime}(\mathit{\Upphi}_0^{k,j}) \sum\limits_{i=1}^nu_i^kf_i^{k,j}G^{\prime}(h_i^{k,j})\\ &\qquad\cdot h_i^{k,j} (\xi_i^{k,j}\odot {\bf b}_{\bf i}^k\odot \Updelta {\bf b}_{\bf i}^k)\cdot\Updelta {\bf a}_{\bf i}^k\\ &\quad\leq2M_1C_0C_1(nC_0+C_1)\sum\limits_{j=1}^J\sum\limits_{i=1}^n \|\Updelta {\bf b}_{\bf i}^k\|\|\Updelta {\bf a}_{\bf i}^k\|\\ &\quad\leq JM_1C_0C_1(nC_0+C_1)\sum\limits_{i=1}^n(\|\Updelta {\bf a}_{\bf i}^k\|^2+\|\Updelta {\bf b}_{\bf i}^k\|^2), \end{aligned} $$
(52)
and
$$ \begin{aligned} &-\sum\limits_{j=1}^Jg_j^{\prime}(\mathit{\Upphi}_0^{k,j})\delta\\ &\quad=-\sum\limits_{j=1}^Jg_j^{\prime}(\mathit{\Upphi}_0^{k,j}) \sum\limits_{i=1}^nu_i^kf_i^{k,j}G^{\prime}(h_i^{k,j})h_i^{k,j} \|\mathit{\Upphi}_i^{k+1,j}-\mathit{\Upphi}_i^{k,j}\|^2\\ &\quad\leq M_1C_0(nC_0+C_1)\sum\limits_{j=1}^J \sum\limits_{i=1}^n\|\mathit{\Upphi}_i^{k+1,j}-\mathit{\Upphi}_i^{k,j}\|^2\\ &\quad= JM_1C_0(nC_0+C_1)\sum\limits_{i=1}^n \|(-\Updelta {\bf a}_{\bf i}^k)\odot {\bf b}_{\bf i}^{k+1}\\ &\qquad+\xi_i^{k,j}\odot \Updelta {\bf b}_{\bf i}^k\|^2\\ &\quad\leq JM_1C_0(nC_0+C_1)\sum\limits_{i=1}^n((C_0^2+C_0C_1)\|\Updelta {\bf a}_{\bf i}^k\|^2\\ &\qquad+(C_1^2+C_0C_1)\|\Updelta {\bf b}_{\bf i}^k\|^2)\\ &\quad\leq C_{71}\sum\limits_{i=1}^n(\|\Updelta {\bf a}_{\bf i}^k\|^2+\|\Updelta {\bf b}_{\bf i}^k\|^2), \end{aligned} $$
(53)
where \(C_{71}=JM_1C_0(C_0+C_1)(nC_0+C_1)\max\{C_0,C_1\}\). The combination of (51), (52) and (53) leads to
$$ \begin{aligned} &\sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j})\Updelta_1 \leq \sum\limits_{i=1}^n \frac{\partial E({\bf W}^k)}{\partial{\bf a}_{\bf i}}\cdot\Updelta {\bf a}_{\bf i}^k \\ &\qquad+\sum\limits_{i=1}^n \frac{\partial E({\bf W}^k)} {\partial{\bf b}_{\bf i}}\cdot\Updelta {\bf b}_{\bf i}^k+ C_{72}\sum\limits_{i=1}^n(\|\Updelta {\bf a}_{\bf i}^k\|^2+\|\Updelta {\bf b}_{\bf i}^k\|^2), \end{aligned} $$
(54)
where \(C_{72}=C_{71}+JM_1C_0C_1(nC_0+C_1)\). Furthermore,
$$ \begin{aligned} &\sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j})\Updelta_2\\ &\quad=\frac{1}{2}\sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j})\sum\limits_{i=1}^nu_i^kf_i^{k,j} G^{\prime}(h_i^{k,j}){\exp}(\widetilde{t}_i^{s,j})\\ &\qquad\cdot(\mathit{\Upphi}_i^{k+1,j}\cdot\mathit{\Upphi}_i^{k+1,j}- \mathit{\Upphi}_i^{k,j}\cdot\mathit{\Upphi}_i^{k,j})^2\\ &\quad\leq \frac{1}{2}M_1C_0(nC_0+C_1) \sum\limits_{j=1}^J\sum\limits_{i=1}^n (\mathit{\Upphi}_i^{k+1,j}\cdot\mathit{\Upphi}_i^{k+1,j}- \mathit{\Upphi}_i^{k,j}\cdot\mathit{\Upphi}_i^{k,j})^2\\ &\quad=\frac{1}{2}M_1C_0(nC_0+C_1) \sum\limits_{j=1}^J\sum\limits_{i=1}^n [(\mathit{\Upphi}_i^{k+1,j}+\mathit{\Upphi}_i^{k,j})\cdot (\mathit{\Upphi}_i^{k+1,j}-\mathit{\Upphi}_i^{k,j})]^2\\ &\quad\leq 2M_1C_0C_1^2(nC_0+C_1)\sum\limits_{j=1}^J \sum\limits_{i=1}^n\|\mathit{\Upphi}_i^{k+1,j}-\mathit{\Upphi}_i^{k,j}\|^2\\ &\quad\leq 2JM_1C_0C_1^2(nC_0+C_1)\sum\limits_{i=1}^n((C_0^2+C_0C_1)\|\Updelta {\bf a}_{\bf i}^k\|^2\\ &\qquad+(C_1^2+C_0C_1)\|\Updelta {\bf b}_{\bf i}^k\|^2)\\ &\quad\leq C_{73}\sum\limits_{i=1}^n(\|\Updelta {\bf a}_{\bf i}^k\|^2+\|\Updelta {\bf b}_{\bf i}^k\|^2), \end{aligned} $$
(55)
where \(C_{73}=2JM_1C_0C_1^2(C_0+C_1)(nC_0+C_1)\max\{C_0,C_1\}\). The combination of (49), (54) and (55) leads to
$$ \begin{aligned} &\sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j})\mathit{\Upomega}_1\\ &\quad=\sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j})\Updelta_1 +\sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j})\Updelta_2\\ &\quad\leq \sum\limits_{i=1}^n \frac{\partial E({\bf W}^k)}{\partial{\bf a}_{\bf i}}\cdot\Updelta {\bf a}_{\bf i}^k +\sum\limits_{i=1}^n \frac{\partial E({\bf W}^k)}{\partial{\bf b}_{\bf i}}\cdot\Updelta {\bf b}_{\bf i}^k\\ &\quad+(C_{72}+C_{73})\sum\limits_{i=1}^n(\|\Updelta {\bf a}_{\bf i}^k\|^2+\|\Updelta {\bf b}_{\bf i}^k\|^2)\\ &\quad=-\eta\sum\limits_{i=1}^n\left( \left\|\frac{\partial E({\bf W}^k)} {\partial{\bf a}_{\bf i}}\right\|^2+\left\|\frac{\partial E({\bf W}^k)}{\partial{\bf b}_{\bf i}}\right\|^2\right)\\ &\qquad+(C_{72}+C_{73})\eta^2\sum\limits_{i=1}^n\left( \left\|\frac{\partial E({\bf W}^k)}{\partial{\bf a}_{\bf i}}\right\|^2+\left\|\frac{\partial E({\bf W}^k)} {\partial{\bf b}_{\bf i}}\right\|^2\right).\\ \end{aligned} $$
(56)
As for \(\mathit{\Upomega}_2\), from Assumption (A2) and (38), it can be seen that
$$ \begin{aligned} \mathit{\Upomega}_2&=\sum\limits_{i=1}^nu_i^kf_i^{k,j}G^{\prime\prime} (\widetilde{\sigma}_i^{p,j})(h_i^{k+1,j}-h_i^{k,j})^2\\ &\leq M_2C_0\sum\limits_{i=1}^n|h_i^{k+1,j}-h_i^{k,j}|^2 \\ &\leq M_2C_0C_{21}^2\sum\limits_{i=1}^n(\|\Updelta {\bf a}_{\bf i}^k\|+\|\Updelta {\bf b}_{\bf i}^k\|)^2\\ &\leq 2M_2C_0C_{21}^2\sum\limits_{i=1}^n(\|\Updelta {\bf a}_{\bf i}^k\|^2+\|\Updelta {\bf b}_{\bf i}^k\|^2),\\ \end{aligned} $$
(57)
then
$$ \begin{aligned} &\sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j})\mathit{\Upomega}_2\\ &\quad\leq 2M_2C_0C_{21}^2\sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j})\sum\limits_{i=1}^n(\|\Updelta {\bf a}_{\bf i}^k\|^2+\|\Updelta {\bf b}_{\bf i}^k\|^2)\\ &\quad\leq 2JM_2C_0C_{21}^2(nC_0+C_1)\sum\limits_{i=1}^n(\|\Updelta {\bf a}_{\bf i}^k\|^2+\|\Updelta {\bf b}_{\bf i}^k\|^2)\\ &\quad\leq C_{74}\eta^2\sum\limits_{i=1}^n\left( \left\|\frac{\partial E({\bf W}^k)}{\partial{\bf a}_{\bf i}}\right\|^2+ \left\|\frac{\partial E({\bf W}^k)} {\partial{\bf b}_{\bf i}}\right\|^2\right),\\ \end{aligned} $$
(58)
where \(C_{74}=2JM_2C_0C_{21}^2(nC_0+C_1)\). The combination of (56) and (58) leads to
$$ \begin{aligned} &\sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j}){\bf u}^k\cdot(f^{k,j}\odot \psi^{k,j}) \\ &\quad=\sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j})\mathit{\Upomega}_1+\sum\limits_{j=1}^J g_j^{\prime}(\mathit{\Upphi}_0^{k,j})\mathit{\Upomega}_2 \\ &\quad\leq-\eta\sum\limits_{i=1}^n\left( \left\|\frac{\partial E({\bf W}^k)}{\partial{\bf a}_{\bf i}}\right\|^2+ \left\|\frac{\partial E({\bf W}^k)} {\partial{\bf b}_{\bf i}}\right\|^2\right)\\ &\qquad+(C_{72}+C_{73})\eta^2\sum\limits_{i=1}^n\left( \left\|\frac{\partial E({\bf W}^k)}{\partial{\bf a}_{\bf i}}\right\|^2+\left\|\frac{\partial E({\bf W}^k)} {\partial{\bf b}_{\bf i}}\right\|^2\right)\\ &\qquad+ C_{74}\eta^2\sum\limits_{i=1}^n\left( \left\|\frac{\partial E({\bf W}^k)}{\partial{\bf a}_{\bf i}}\right\|^2+ \left\|\frac{\partial E({\bf W}^k)} {\partial{\bf b}_{\bf i}}\right\|^2\right)\\ &\quad=-\eta\sum\limits_{i=1}^n\left( \left\|\frac{\partial E({\bf W}^k)}{\partial{\bf a}_{\bf i}}\right\|^2+\left\| \frac{\partial E({\bf W}^k)}{\partial{\bf b}_{\bf i}}\right\|^2\right)\\ &\qquad+C_7\eta^2\sum\limits_{i=1}^n\left( \left\|\frac{\partial E({\bf W}^k)}{\partial{\bf a}_{\bf i}}\right\|^2+\left\| \frac{\partial E({\bf W}^k)}{\partial{\bf b}_{\bf i}}\right\|^2\right)\\ &\quad=-(\eta-C_7\eta^2)\sum\limits_{i=1}^n\left( \left\|\frac{\partial E({\bf W}^k)}{\partial{\bf a}_{\bf i}}\right\|^2+ \left\|\frac{\partial E({\bf W}^k)}{\partial{\bf b}_{\bf i}}\right\|^2\right),\\ \end{aligned} $$
(59)
where \(C_7=C_{72}+C_{73}+C_{74}\). So Lemma 2 (30) is proved. \(\square\)
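For the reader's convenience, the elementary inequalities behind the last steps of (52), (53), (55) and (57) can be sketched as follows. The generic scalars \(p,q,\alpha,\beta\geq 0\) are placeholders introduced here only for illustration and are not notation of the proof above. From \(2pq\leq p^2+q^2\) it follows that
$$ \begin{aligned} (p+q)^2&\leq 2(p^2+q^2),\\ (\alpha p+\beta q)^2&=\alpha^2p^2+2\alpha\beta pq+\beta^2q^2\\ &\leq(\alpha^2+\alpha\beta)p^2+(\beta^2+\alpha\beta)q^2\\ &=(\alpha+\beta)(\alpha p^2+\beta q^2)\\ &\leq(\alpha+\beta)\max\{\alpha,\beta\}(p^2+q^2). \end{aligned} $$
Taking \(p=\|\Updelta {\bf a}_{\bf i}^k\|\) and \(q=\|\Updelta {\bf b}_{\bf i}^k\|\), the inequality \(2pq\leq p^2+q^2\) gives the last step of (52), and the first line of the display gives the last step of (57). The remaining lines with \(\alpha=C_0\) and \(\beta=C_1\) reproduce the coefficients \(C_0^2+C_0C_1\) and \(C_1^2+C_0C_1\) in (53) and (55), together with the factor \((C_0+C_1)\max\{C_0,C_1\}\) appearing in the constants. This reading assumes the bound \(\|(-\Updelta {\bf a}_{\bf i}^k)\odot {\bf b}_{\bf i}^{k+1}+\xi_i^{k,j}\odot \Updelta {\bf b}_{\bf i}^k\|\leq C_0p+C_1q\), which would follow from (23), Assumption (A1) and Property 1) of “\(\odot\)” as they are used above; those statements are not restated in this part of the appendix. Finally, since \(\eta-C_7\eta^2=\eta(1-C_7\eta)\), the right-hand side of (59) is nonpositive whenever \(0<\eta\leq 1/C_7\) (taking \(C_7>0\)).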
Proof of Lemma 2 (31)
By the definition of \(g_j(t)\) in (9), it is straightforward to derive that \(g_j^{\prime\prime}(t)=1\) and that \(G(\cdot)\) is bounded on \([0,+\infty)\). Employing Assumption (A1), and also (24) and (43), the following can be established:
$$ \begin{aligned} &\frac{1}{2}\sum\limits_{j=1}^J g_j^{\prime\prime}(s_{k,j})(\mathit{\Upphi}_0^{k+1,j}-\mathit{\Upphi}_0^{k,j})^2 \\ &\quad=\frac{1}{2}\sum\limits_{j=1}^J \|\mathit{\Upphi}_0^{k+1,j}-\mathit{\Upphi}_0^{k,j}\|^2 \\ &\quad=\frac{1}{2}\sum\limits_{j=1}^J \|{\bf u}^{k+1}\cdot (f^{k+1,j}\odot G^{k+1,j})-{\bf u}^{k}\cdot (f^{k,j}\odot G^{k,j})\|^2 \\ &\quad=\frac{1}{2}\sum\limits_{j=1}^J \|({\bf u}^{k+1}-{\bf u}^{k})\cdot (f^{k+1,j}\odot G^{k+1,j}) \\ &\qquad+{\bf u}^{k}\cdot (f^{k+1,j}\odot G^{k+1,j}-f^{k,j}\odot G^{k,j})\|^2 \\ &\quad\leq \frac{1}{2} \sum\limits_{j=1}^J(n(n+C_0)\|\Updelta {\bf u}^{k}\|^2 \\ &\qquad+C_0(n+C_0)\|f^{k+1,j}\odot G^{k+1,j}-f^{k,j}\odot G^{k,j}\|^2)\\ &\quad=\frac{1}{2} \sum\limits_{j=1}^J(n(n+C_0)\|\Updelta {\bf u}^{k}\|^2+C_0(n+C_0) \\ &\qquad\cdot\|(f^{k+1,j}-f^{k,j})\odot G^{k+1,j}+f^{k,j}\odot (G^{k+1,j}-G^{k,j})\|^2)\\ &\quad\leq \frac{1}{2} \sum\limits_{j=1}^J(n(n+C_0)\|\Updelta {\bf u}^{k}\|^2+2nC_0(n+C_0) \|\varphi^{k,j}\|^2 \\ &\qquad+2nC_0(n+C_0)\|\psi^{k,j}\|^2)\\ &\quad \leq \frac{1}{2} J n(n+C_0)\eta^2 \left\|\frac{\partial E({\bf W}^k)}{\partial{\bf u}}\right\|^2 \\ &\qquad+JnM^2C_0(n+C_0)\eta^2\sum\limits_{i=1}^n \left\|\frac{\partial E({\bf W}^k)}{\partial{\bf v}_{\bf i}}\right\|^2 \\ &\qquad+ 2JnM_1^2C_0C_{21}^2(n+C_0) \eta^2\sum\limits_{i=1}^n\left(\left\| \frac{\partial E({\bf W}^k)} {\partial{\bf a}_{\bf i}}\right\|^2+ \left\|\frac{\partial E({\bf W}^k)}{\partial{\bf b}_{\bf i}}\right\|^2\right) \\ &\quad\leq C_8\eta^2\left\|\frac{\partial E({\bf W}^k)}{\partial{\bf W}}\right\|^2,\\ \end{aligned} $$
where \(C_8=\frac{1}{2}Jn(n+C_0)\max\{1,2M^2C_0,4M_1^2C_0C_{21}^2\}\). Lemma 2 (31) is then proved. This completes the whole proof of Lemma 2. \(\square\)
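The two algebraic steps of the last estimate that are easiest to lose track of can be sketched as follows. The scalars \(p,q\geq 0\) and the shorthand \(X_{\bf u}\), \(X_{\bf v}\), \(X_{{\bf a}{\bf b}}\) are introduced only for this illustration. First, the coefficients \(n(n+C_0)\) and \(C_0(n+C_0)\) are of the form produced by
$$ (np+C_0q)^2=n^2p^2+2nC_0pq+C_0^2q^2\leq n(n+C_0)p^2+C_0(n+C_0)q^2, $$
which uses \(2pq\leq p^2+q^2\), presumably with \(p=\|\Updelta {\bf u}^{k}\|\) and \(q=\|f^{k+1,j}\odot G^{k+1,j}-f^{k,j}\odot G^{k,j}\|\); the underlying bounds \(\|f^{k+1,j}\odot G^{k+1,j}\|\leq n\) and \(\|{\bf u}^k\|\leq C_0\) are assumptions of this sketch, since they are not restated in this part of the appendix. Second, writing \(X_{\bf u}=\left\|\frac{\partial E({\bf W}^k)}{\partial{\bf u}}\right\|^2\), \(X_{\bf v}=\sum_{i=1}^n\left\|\frac{\partial E({\bf W}^k)}{\partial{\bf v}_{\bf i}}\right\|^2\) and \(X_{{\bf a}{\bf b}}=\sum_{i=1}^n\left(\left\|\frac{\partial E({\bf W}^k)}{\partial{\bf a}_{\bf i}}\right\|^2+\left\|\frac{\partial E({\bf W}^k)}{\partial{\bf b}_{\bf i}}\right\|^2\right)\), the constant \(C_8\) arises from
$$ \begin{aligned} &\frac{1}{2}Jn(n+C_0)X_{\bf u}+JnM^2C_0(n+C_0)X_{\bf v}+2JnM_1^2C_0C_{21}^2(n+C_0)X_{{\bf a}{\bf b}}\\ &\quad=\frac{1}{2}Jn(n+C_0)\left(X_{\bf u}+2M^2C_0X_{\bf v}+4M_1^2C_0C_{21}^2X_{{\bf a}{\bf b}}\right)\\ &\quad\leq C_8\left(X_{\bf u}+X_{\bf v}+X_{{\bf a}{\bf b}}\right). \end{aligned} $$
Assuming that \({\bf W}\) collects \({\bf u}\), \({\bf v}_{\bf i}\), \({\bf a}_{\bf i}\) and \({\bf b}_{\bf i}\), so that \(\left\|\frac{\partial E({\bf W}^k)}{\partial{\bf W}}\right\|^2=X_{\bf u}+X_{\bf v}+X_{{\bf a}{\bf b}}\), the last step of the estimate therefore holds as an upper bound with the stated \(C_8\).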