Appendix
1.1 The computing details of the algorithm DisSRADMM
In this section we give the details on solving problems (10)–(13) in the proposed algorithm DisSRADMM.
Update \(\mathbf {z}_m^{k+1}\) : consider
$$\begin{aligned} \begin{aligned} \mathbf {z}^{k+1}_{m}&= \arg \min _{\mathbf {z}_m}\mathcal {L}(\varvec{\mu }^k,\varvec{\beta }^k,\varvec{\eta }^k,\mathbf {z}_m^k,\mathbf {z}_{-m}^k,\varvec{\beta }_1^k,...,\varvec{\beta }_M^k, \varvec{\mu }_1^k,...,\varvec{\mu }_M^k, \\&\qquad \mathbf {s}^k, \mathbf {u}_1^k,...\mathbf {u}_{M}^k, \mathbf {q}_1^k,...\mathbf {q}_{M}^k,\mathbf {v}_1^k,...\mathbf {v}_{M}^k). \end{aligned} \end{aligned}$$
So we have
$$\begin{aligned} \begin{aligned} \mathbf {z}_m^{k+1}&= \arg \min _{\mathbf {z}_m}\bigg \{\frac{1}{2}\Vert \mathbf {z}_m\Vert ^2 + \frac{d_{m}}{2}\Vert \mathbf {y}_m - \mathbf {X}_m\varvec{\beta }_m^k - \mathbf {A}_m\varvec{\mu }_m^k-\mathbf {z}_m\Vert ^2\\&\quad +\langle \mathbf {y}_m - \mathbf {X}_m\varvec{\beta }_m^k - \mathbf {A}_m\varvec{\mu }_m^k-\mathbf {z}_m,\mathbf {u}_m^k \rangle \bigg \}. \end{aligned} \end{aligned}$$
The previous problem can be solved elementwisely, so for any \(i=1,2,...,n_m\), we carry out the following updating procedure
$$\begin{aligned} \begin{aligned} z_{m,i}^{k+1}&= \arg \min _{z_{m,i}}\bigg \{\frac{1}{2}\Vert z_{m,i}\Vert ^2 +\frac{d_{m}}{2}\Vert \mathbf {y}_{m,i} - \mathbf {x}_{m,i}^{\top }\varvec{\beta }_m^k - \mathbf {A}_{m,i}^{\top }\varvec{\mu }_m^k-\mathbf {z}_{m,i}\Vert ^2\\&\quad +\langle \mathbf {y}_{m,i}- \mathbf {x}_{m,i}^{\top }\varvec{\beta }_m^k - \mathbf {A}_{m,i}^{\top }\varvec{\mu }_m^k-\mathbf {z}_{m,i},\mathbf {u}_{m,i}^k \rangle \bigg \}, \end{aligned} \end{aligned}$$
where \(\mathbf {A}_{m,i}\) represents the ith row vector of the matrix \(\mathbf {A}_m\). Thus, we consider the following equations
$$\begin{aligned} \begin{aligned} 0&=\frac{\partial }{\partial \mathbf {z}_m}\bigg \{\frac{1}{2}\Vert z_{m,i}\Vert ^2 +\frac{d_{m}}{2}\Vert \mathbf {y}_{m,i} - \mathbf {x}_{m,i}^{\top }\varvec{\beta }_m^k - \mathbf {A}_{m,i}^{\top }\varvec{\mu }_m^k-\mathbf {z}_{m,i}\Vert ^2\\&\quad +\langle \mathbf {y}_{m,i}- \mathbf {x}_{m,i}^{\top }\varvec{\beta }_m^k - \mathbf {A}_{m,i}^{\top }\varvec{\mu }_m^k-\mathbf {z}_{m,i},\mathbf {u}_{m,i}^k \rangle \bigg \}\\&=z_{m,i} - d_m(y_{m,i}-\mathbf {x}_{m,i}^{\top }\varvec{\beta }_m^k-\mathbf {A}_{m,i}^{\top }\varvec{\mu }_m^k-z_{m,i})-u_{m,i}^k. \end{aligned} \end{aligned}$$
It then follows that
$$\begin{aligned} z_{m,i} = \left\{ d_m(y_{m,i}-\mathbf {x}_{m,i}^{\top }\varvec{\beta }_m^k-\mathbf {A}_{m,i}^{\top }\varvec{\mu }_m^k)+u_{m,i}^k\right\} \bigg /\left( d_m+1 \right) . \end{aligned}$$
Update \(\varvec{\beta }_m^{k+1}\): note that
$$\begin{aligned} \begin{aligned} \varvec{\beta }^{k+1}_{m}&= \arg \min _{\varvec{\beta }_m}\mathcal {L}(\varvec{\mu }^k,\varvec{\beta },\varvec{\beta }_m,\varvec{\beta }_{-m}^k,\varvec{\eta }^k,\varvec{\mu }_1^k,...,\varvec{\mu }_M^k, \mathbf {z}_1^k,...,\mathbf {z}_M^k, \\&\quad \mathbf {s}^k, \mathbf {u}_1^k,...\mathbf {u}_{M}^k, \mathbf {q}_1^k,...\mathbf {q}_{M}^k,\mathbf {v}_1^k,...\mathbf {v}_{M}^k), \end{aligned} \end{aligned}$$
which is given by
$$\begin{aligned} \begin{aligned} \varvec{\beta }_m^{k+1}&= \arg \min _{\varvec{\beta }_m}\bigg \{\frac{d_{m}}{2}\Vert \mathbf {y}_m - \mathbf {X}_m\varvec{\beta }_m - \mathbf {A}_m\varvec{\mu }_m^k-\mathbf {z}_m^k\Vert ^2\\&\quad +\langle \mathbf {y}_m - \mathbf {X}_m\varvec{\beta }_m - \mathbf {A}_m\varvec{\mu }_m^k-\mathbf {z}_m^k,\mathbf {u}_m^k \rangle + \frac{b_{m}}{2}\Vert \varvec{\beta }_{m}-\varvec{\beta }^k\Vert ^2\\&\quad + \langle \varvec{\beta }_{m}-\varvec{\beta }^k,\mathbf {v}_m^k \rangle \bigg \}. \end{aligned} \end{aligned}$$
We then have the following equations
$$\begin{aligned} \begin{aligned} 0&= \frac{\partial }{\partial \varvec{\beta }_m}\bigg \{\frac{d_{m}}{2}\Vert \mathbf {y}_m - \mathbf {X}_m\varvec{\beta }_m - \mathbf {A}_m\varvec{\mu }_m^k-\mathbf {z}_m^k\Vert ^2+\langle \mathbf {y}_m - \mathbf {X}_m\varvec{\beta }_m - \mathbf {A}_m\varvec{\mu }_m^k-\mathbf {z}_m^k,\mathbf {u}_m^k \rangle \\&\quad + \frac{b_{m}}{2}\Vert \varvec{\beta }_{m}-\varvec{\beta }^k\Vert ^2+ \langle \varvec{\beta }_{m}-\varvec{\beta }^k,\mathbf {v}_m^k \rangle \bigg \}\\&=-d_m\left( \mathbf {X}_m^{\top }\mathbf {y}_m - \mathbf {X}_m^{\top }\mathbf {X}_m\varvec{\beta }_m^k - \mathbf {X}_m^{\top }\mathbf {A}_m\varvec{\mu }_m-\mathbf {X}_m^{\top }\mathbf {z}_m^k \right) - \mathbf {X}_m^{\top }\mathbf {u}_m^k\\&\quad +b_m(\varvec{\beta }_m-\varvec{\beta }^k) + \mathbf {v}_m^k\\&=(d_m\mathbf {X}_m^{\top }\mathbf {X}_m + b_m\mathbf {I})\varvec{\beta }_m - d_m(\mathbf {X}_m^{\top }\mathbf {y}_m - \mathbf {X}_m^{\top }\mathbf {A}_m\varvec{\mu }_m^k-\mathbf {X}_m^{\top }\mathbf {z}_m^k)-\mathbf {X}_m^{\top }\mathbf {u}_m^k - b_m\varvec{\beta }^k + \mathbf {v}_m^k. \end{aligned} \end{aligned}$$
Hence, it follows that
$$\begin{aligned} \begin{aligned} (d_m\mathbf {X}_m^{\top }\mathbf {X}_m + b_m\mathbf {I})\varvec{\beta }_m&= d_m[\mathbf {X}_m^{\top }\mathbf {y}_m - \mathbf {X}_m^{\top }\mathbf {A}_m\varvec{\mu }_m^k-\mathbf {X}_m^{\top }\mathbf {z}_m^k]+\mathbf {X}_m^{\top }\mathbf {u}_m^k+ b_m\varvec{\beta }^k\\&\quad -\mathbf {v}_m^k, \end{aligned} \end{aligned}$$
which is equivalent to
$$\begin{aligned} \begin{aligned} \left( \frac{d_m}{b_m}\mathbf {X}_m^{\top }\mathbf {X}_m + \mathbf {I}\right) \varvec{\beta }_m&= \frac{d_m}{b_m}(\mathbf {X}_m^{\top }\mathbf {y}_m - \mathbf {X}_m^{\top }\mathbf {A}_m\varvec{\mu }_m^k-\mathbf {X}_m^{\top }\mathbf {z}_m)+\frac{1}{b_m}\mathbf {X}_m^{\top }\mathbf {u}_m^k\\&\quad + \varvec{\beta }^k-\frac{1}{b_m}\mathbf {v}_m^k. \end{aligned} \end{aligned}$$
Let \(o_1=n_1+n_2+...+n_{m-1}\) and \(o_2=o_1+m_m\). \(\varvec{\mu }_{m,in}\) represents the elements between \(o_1\)th and \(o_2\)th and \(\varvec{\mu }_{m,out}\) represents all the other elements in \(\varvec{\mu }_m\). Then, we have
$$\begin{aligned} \begin{aligned} \varvec{\beta }_m^{k+1}&= \left( \frac{d_m}{b_m}\mathbf {X}_m^{\top }\mathbf {X}_m + \mathbf {I}\right) ^{-1}\bigg \{\frac{d_m}{b_m}(\mathbf {X}_m^{\top }\mathbf {y}_m - \mathbf {X}_m^{\top }\varvec{\mu }_{m,in}^k-\mathbf {X}_m^{\top }\mathbf {z}_m)+\frac{1}{b_m}\mathbf {X}_m^{\top }\mathbf {u}_m^k\\&\quad + \varvec{\beta }^k-\frac{1}{b_m}\mathbf {v}_m^k\bigg \}. \end{aligned} \end{aligned}$$
Update \(\varvec{\mu }_m^{k+1}\): consider
$$\begin{aligned} \begin{aligned} \varvec{\mu }^{k+1}_{m}&= \arg \min _{\varvec{\mu }_m}\mathcal {L}(\varvec{\mu }^k,\varvec{\beta }^k,\varvec{\eta }^k,\varvec{\mu }_m,\varvec{\mu }_{-m}^k,\varvec{\beta }_1^k,...,\varvec{\beta }_M^k, \mathbf {z}_1^k,...,\mathbf {z}_M^k, \\&\quad \mathbf {s}^k, \mathbf {u}_1^k,...\mathbf {u}_{M}^k, \mathbf {q}_1^k,...\mathbf {q}_{M}^k,\mathbf {v}_1^k,...\mathbf {v}_{M}^k), \end{aligned} \end{aligned}$$
which is given by
$$\begin{aligned} \begin{aligned} \varvec{\mu }_m^{k+1}&= \arg \min _{\varvec{\mu }_m}\bigg \{\frac{d_{m}}{2}\Vert \mathbf {y}_m - \mathbf {X}_m\varvec{\beta }_m^k - \mathbf {A}_m\varvec{\mu }_m-\mathbf {z}_m^k\Vert ^2\\&\quad +\langle \mathbf {y}_m - \mathbf {X}_m\varvec{\beta }_m^k - \mathbf {A}_m\varvec{\mu }_m-\mathbf {z}_m^k,\mathbf {u}_m^k \rangle + \frac{a_{m}}{2}\Vert \varvec{\mu }_{m}-\varvec{\mu }^k\Vert ^2\\&\quad + \langle \varvec{\mu }_{m}-\varvec{\mu }^k,\mathbf {q}_m^k \rangle \bigg \}. \end{aligned} \end{aligned}$$
This optimization problem can be solved by considering the following equations
$$\begin{aligned} \begin{aligned} 0&= \frac{\partial }{\partial \varvec{\mu }_m}\bigg \{\frac{d_{m}}{2}\Vert \mathbf {y}_m - \mathbf {X}_m\varvec{\beta }_m^k - \mathbf {A}_m\varvec{\mu }_m-\mathbf {z}_m^k\Vert ^2+\langle \mathbf {y}_m - \mathbf {X}_m\varvec{\beta }_m^k - \mathbf {A}_m\varvec{\mu }_m-\mathbf {z}_m^k,\mathbf {u}_m^k \rangle \\&\quad + \frac{a_{m}}{2}\Vert \varvec{\mu }_{m}-\varvec{\mu }\Vert ^2+ \langle \varvec{\mu }_{m}-\varvec{\mu },\mathbf {q}_m^k \rangle \bigg \}\\&=-d_m\left( \mathbf {A}_m^{\top }\mathbf {y}_m - \mathbf {A}_m^{\top }\mathbf {X}_m\varvec{\beta }_m^k - \mathbf {A}_m^{\top }\mathbf {A}_m\varvec{\mu }_m-\mathbf {A}_m^{\top }\mathbf {z}_m^k \right) - \mathbf {A}_m^{\top }\mathbf {u}_m^k\\&\quad +a_m(\varvec{\mu }_m-\varvec{\mu }^k) + \mathbf {q}_m^k\\&=(d_m\mathbf {A}_m^{\top }\mathbf {A}_m + a_m\mathbf {I})\varvec{\mu }_m - d_m(\mathbf {A}_m^{\top }\mathbf {y}_m - \mathbf {A}_m^{\top }\mathbf {X}_m\varvec{\beta }_m^k-\mathbf {A}_m^{\top }\mathbf {z}_m^k)-\mathbf {A}_m^{\top }\mathbf {u}_m^k - a_m\varvec{\mu }^k + \mathbf {q}_m^k. \end{aligned} \end{aligned}$$
Hence, we obtain that
$$\begin{aligned} \begin{aligned}&\left( \frac{d_m}{a_m}\mathbf {A}_m^{\top }\mathbf {A}_m + \mathbf {I}\right) \varvec{\mu }_m = \frac{d_m}{a_m}(\mathbf {A}_m^{\top }\mathbf {y}_m - \mathbf {A}_m^{\top }\mathbf {X}_m\varvec{\beta }_m^k-\mathbf {A}_m^{\top }\mathbf {z}_m)+\frac{1}{a_m}\mathbf {A}_m^{\top }\mathbf {u}_m^k\\&\quad + \varvec{\mu }^k-\frac{1}{a_m}\mathbf {q}_m^k. \end{aligned} \end{aligned}$$
Although it involves computing the inverse of large matrices, there are some tricks. Suppose the samples are arranged by nodes. Then the element at ith row and jth column of the matrix \(\mathbf {A}_m\) is given by
$$\begin{aligned} A_{m,ij}=\left\{ \begin{array}{lcl} 1&{}&{}o_{m1}\le i=j\le o_{m2}\\ 0&{}&{} \textit{otherwise} \end{array}, \right. \end{aligned}$$
where \(o_{m1}=n_1+n_2+\cdots +n_{m-1}\) and \(o_{m2}=o_{m1}+n_m\). Due to these special structures, we divide \(\varvec{\mu }_m\) into two parts \(\varvec{\mu }_{m,in}\) and \(\varvec{\mu }_{m,out}\), where \(\varvec{\mu }_{m,in}\) is composed of the \(o_{m1}\)th to the \(o_{m2}\)th elements, and \(\varvec{\mu }_{m,out}\) contains all the rest elements. Similar segmentation is also applied to \(\varvec{\mu }^k\) and \(\mathbf {q}_m^k\), which is \(\varvec{\mu }_{in}^k\) , \(\varvec{\mu }_{out}^k\), \(\mathbf {q}_{m,in}^k\) and \(\mathbf {q}_{m,out}^k\). Then we have
$$\begin{aligned} \varvec{\mu }_{m,in}^{k+1} = \frac{a_m}{d_m+a_m}\left\{ \frac{d_m}{a_m}(\mathbf {y}_m-\mathbf {X}_m\varvec{\beta }_m^k-\mathbf {z}_m^k)+\frac{1}{a_m}\mathbf {u}_m^k + \varvec{\mu }_{in}^k - \mathbf {q}_{m,in}^k\right\} . \end{aligned}$$
Since the problem can also be solved elementwisely, then for any \(i=1,2,...,n_m\), we obtain that
$$\begin{aligned} \mu _{m,in,i}^{k+1} = \frac{d_m}{d_m+a_m}(y_{m,i}-\mathbf {x}_{m,i}^{\top }\varvec{\beta }_m^k-z_{m,i})+\frac{1}{d_m+a_m}(u_{m,in,i}^k+a_m\mu _{in,i}^k-q_{m,in,i}^k). \end{aligned}$$
Similarly, we can update \(\varvec{\mu }_{m,out}^k\) elementwisely
$$\begin{aligned} \mu _{m,out,i}^{k+1}=\mu _{out,i}-\frac{1}{a_m}q_{m,out,i}^k. \end{aligned}$$
Update \(\varvec{\mu }^{k+1}\): note that
$$\begin{aligned} \begin{aligned} \varvec{\mu }^{k+1}&= \arg \min _{\varvec{\mu }}\mathcal {L}(\varvec{\mu },\varvec{\beta }^k,\varvec{\eta }^k,\varvec{\mu }_1^k,...,\varvec{\mu }_M^k,\varvec{\beta }_1^k,...,\varvec{\beta }_M^k, \mathbf {z}_1^k,...,\mathbf {z}_M^k, \\&\qquad \mathbf {s}^k, \mathbf {u}_1^k,...\mathbf {u}_{M}^k, \mathbf {q}_1^k,...\mathbf {q}_{M}^k,\mathbf {v}_1^k,...\mathbf {v}_{M}^k), \end{aligned} \end{aligned}$$
which is given by
$$\begin{aligned} \begin{aligned} \varvec{\mu }^{k+1}&= \arg \min _{\varvec{\mu }}\frac{c}{2}\Vert \mathbf {W}\varvec{\mu }-\varvec{\eta }\Vert ^2+<\mathbf {W}\varvec{\mu }-\varvec{\eta },\mathbf {s}>+\sum _{m=1}^{M}\Big (\frac{a_{m}}{2}\Vert \varvec{\mu }_{m}-\varvec{\mu }\Vert ^2+ \\&\qquad \langle \varvec{\mu }_{m}-\varvec{\mu },\mathbf {q}_m \rangle \Big ). \end{aligned} \end{aligned}$$
Thus, we can solve the following equations
$$\begin{aligned} \begin{aligned} 0&= c\left( \mathbf {W}^{\top }\mathbf {W}\varvec{\mu }-\mathbf {W}^{\top }\varvec{\eta }^k\right) +\mathbf {W}^{\top }\mathbf {s}^k+\sum _{m=1}^M\left( a_m\varvec{\mu }-a_m\varvec{\mu }_m^{k+1}-\mathbf {q}_m^{k+1}\right) \\&=c\left( \mathbf {W}^{\top }\mathbf {W}\varvec{\mu }-\mathbf {W}^{\top }\varvec{\eta }^k\right) +\mathbf {W}^{\top }\mathbf {s}^k+\sum _{m=1}^M a_m\varvec{\mu }-\sum _{m=1}^M a_m\varvec{\mu }_m^{k+1}-\sum _{m=1}^M \mathbf {q}_m^{k+1}. \end{aligned} \end{aligned}$$
We then have
$$\begin{aligned} \left( \mathbf {W}^{\top }\mathbf {W}+\frac{1}{c}\sum _{m=1}^Ma_m\mathbf {I}\right) \varvec{\mu }= c\mathbf {W}^{\top }\varvec{\eta }^k-\mathbf {W}^{\top }\mathbf {s}^k+\sum _{m=1}^M a_m\varvec{\mu }_m^{k+1}+\sum _{m=1}^M \mathbf {q}_m^{k+1}. \end{aligned}$$
Let \(\varvec{\mu }_s^{k+1}=\sum _{m=1}^Ma_m\varvec{\mu }_m^{k+1}\) and \(\mathbf {q}_s^{k+1} = \sum _{m=1}^Mq_m^{k+1}\). Note that \(\mathbf {W}^{\top }\mathbf {W}= N\mathbf {I}- \mathbf {1}\mathbf {1}^{\top }\), and \(\mathbf {W}\) is a sparse matrix. Let \(r_{ij}\) denote the element at the ith row and the jth column of the matrix \(\left( \mathbf {W}^{\top }\mathbf {W}++\frac{1}{c}\sum _{m=1}^Ma_m\mathbf {I}\right) ^{-1}\) and \(S=N-1+\frac{1}{c}\sum _{m=1}^Ma_m\), we have
$$\begin{aligned} r_{ij}=\left\{ \begin{array}{lcl} \frac{RC + N - 2}{((N-1)-RC\times (N-2)-RC\times RC)\times (\sum _{m=1}^Ma_m-c)}&{}&{} i=j\\ \frac{1}{((N-1)-RC\times (N-2)-RC\times RC)\times (\sum _{m=1}^Ma_m-c)}&{}&{} i\ne j \end{array}, \right. \end{aligned}$$
The detail of updating \(\varvec{\mu }^{k+1}\) is given in Algorithm 2, and the calculation complexity of updating \(\varvec{\mu }^{k+1}\) is \(O(N^2)\).
Update \(\varvec{\beta }\): note that
$$\begin{aligned} \begin{aligned} \varvec{\beta }^{k+1}&=\arg \min _{\varvec{\beta }}\mathcal {L}(\varvec{\mu }^{k+1},\varvec{\beta },\varvec{\eta }^k,\varvec{\mu }_1^{k+1},...,\varvec{\mu }_M^{k+1},\varvec{\beta }_1^{k+1},...,\varvec{\beta }_M^{k+1}, \mathbf {z}_1^k,...,\mathbf {z}_M^k, \\&\qquad \mathbf {s}^k, \mathbf {u}_1^k,...\mathbf {u}_{M}^k, \mathbf {q}_1^k,...\mathbf {q}_{M}^k,\mathbf {v}_1^k,...\mathbf {v}_{M}^k), \end{aligned} \end{aligned}$$
which is given by
$$\begin{aligned} \varvec{\beta }^{k+1}=\arg \min _{\varvec{\beta }}\sum _{m=1}^{M}\Big (\frac{b_{m}}{2}\Vert \varvec{\beta }_{m}-\varvec{\beta }\Vert ^2+ \langle \varvec{\beta }_{m}-\varvec{\beta },\mathbf {v}_m \rangle \Big ). \end{aligned}$$
We then solve the following equation
$$\begin{aligned} 0=\sum _{m=1}^Mb_m\varvec{\beta }-\sum _{m=1}^Mb_m\varvec{\beta }_m^{k+1}-\sum _{m=1}^M\mathbf {v}_m^{k+1}, \end{aligned}$$
which leads to
$$\begin{aligned} \varvec{\beta }^{k+1} = \frac{1}{\sum _{m=1}^Mb_m}\left( \sum _{m=1}^Mb_m\varvec{\beta }_m^{k+1}+\sum _{m=1}^M\mathbf {v}_m^{k+1}\right) . \end{aligned}$$
Update \(\varvec{\eta }\): note that
$$\begin{aligned} \begin{aligned} \varvec{\eta }^{k+1}&=\arg \min _{\varvec{\eta }}\mathcal {L}(\varvec{\mu }^{k+1},\varvec{\beta }^{k+1},\varvec{\eta },\varvec{\mu }_1^{k+1},...,\varvec{\mu }_M^{k+1},\varvec{\beta }_1^{k+1},...,\varvec{\beta }_M^{k+1}, \mathbf {z}_1^k,...,\mathbf {z}_M^k, \\&\quad \mathbf {s}^k, \mathbf {u}_1^k,...\mathbf {u}_{M}^k, \mathbf {q}_1^k,...\mathbf {q}_{M}^k,\mathbf {v}_1^k,...\mathbf {v}_{M}^k), \end{aligned} \end{aligned}$$
which is given by
$$\begin{aligned} \varvec{\eta }^{k+1} = \arg \min _{\varvec{\eta }}\left( P_N(|\varvec{\eta }|,\varvec{\pi }) + \frac{c}{2} \Vert \mathbf {W}\varvec{\mu }^{k+1} - \varvec{\eta }\Vert ^2 + \langle \mathbf {W}\varvec{\mu }^{k+1} - \varvec{\eta }, \mathbf {s}^k\rangle \right) . \end{aligned}$$
Since the penalty function is convex in \(\eta _{ij}\), then 0 must be contained in the subdifferential of this function at point \(\eta ^{k+1}_{ij}\).
$$\begin{aligned} \begin{aligned}&0 \in \partial \bigg \{P_N(\varvec{\eta }, \varvec{\pi })+\frac{c}{2}{\Vert W\varvec{\beta }^{k+1}-\varvec{\eta }\Vert }^2+<W\varvec{\eta }^{k+1}-\varvec{\eta },\mathbf {s}^k>\bigg \}\bigg |_{\varvec{\eta }=\varvec{\eta }^{k+1}}\\ \Rightarrow&0\in \partial P_N(\varvec{\eta }, \varvec{\pi })|_{\varvec{\eta }=\varvec{\eta }^{k+1}} + c\varvec{\eta }^{k+1}-cW\varvec{\mu }^{k+1}-\mathbf {s}^k. \end{aligned} \end{aligned}$$
Let \(\delta _{ij}=\mathbf {W}_{ij}^T\varvec{\mu }^{k+1}+\frac{1}{c}s_{ij}\), we then have \(L_1\) penalty:
$$\begin{aligned} {\hat{\eta }}_{ij}=T(\delta _{ij},\pi /c); \end{aligned}$$
MCP:
$$\begin{aligned} {\hat{\eta }}_{ij}=\left\{ \begin{array}{lcl} \frac{T(\delta _{ij},\pi /c)}{1-1/(\alpha c)}&{}&{}if |\delta _{ij}|\le \alpha \pi \\ \delta _{ij}&{}&{}if |\delta _{ij}|>\alpha \pi \end{array}; \right. \end{aligned}$$
SCAD:
$$\begin{aligned} {\hat{\eta }}_{ij}=\left\{ \begin{array}{lcl} T(\delta _{ij},\pi /c)&{}&{}if|\delta _{ij}|\le \pi +\pi /c\\ \frac{T(\delta _{ij},\alpha \pi /\{(\alpha -1)c\})}{1-1/\{(\alpha -1)c\}}&{}&{}if \pi +\pi /c<|\delta _{ij}|\le \alpha \pi \\ \delta _{ij}&{}&{}if |\delta _{ij}|>\alpha \pi \end{array}, \right. \end{aligned}$$
where \(T(t,\pi )=sign(t)(|t|-\pi )_+\).
1.2 Proof of Theorem 1
Here we take \((\mathbf {z}_1,...,\mathbf {z}_M,\varvec{\beta }_1,...,\varvec{\beta }_M,\varvec{\mu }_1,...,\varvec{\mu }_M,\varvec{\eta })\) as a whole because their updates can be implemented elementwisely. By the definition of
$$\begin{aligned} \big (\mathbf {z}_1^{k+1},...,\mathbf {z}_M^{k+1},\varvec{\beta }_1^{k+1},...,\varvec{\beta }_M^{k+1},\varvec{\mu }_1^{k+1},...,\varvec{\mu }_M^{k+1},\varvec{\eta }^{k+1}\big ), \end{aligned}$$
we have
$$\begin{aligned} \begin{aligned}&\mathcal {L}(\varvec{\beta }^{k+1},\varvec{\mu }^{k+1}, \mathbf {z}_1^{k+1},...,\mathbf {z}_M^{k+1},\varvec{\beta }_1^{k+1},...,\varvec{\beta }_M^{k+1},\varvec{\mu }_1^{k+1},...,\varvec{\mu }_M^{k+1},\varvec{\eta }^{k+1},\\&\qquad \mathbf {s}^{k},\mathbf {u}_1^{k},...,\mathbf {u}_M^{k},\mathbf {v}_1^{k},...,\mathbf {v}_M^{k},\mathbf {q}_1^{k},...,\mathbf {q}_M^{k})\\&\quad \le \mathcal {L}(\varvec{\beta }^{k+1}, \varvec{\mu }^{k+1}, \mathbf {z}_1,...,\mathbf {z}_M,\varvec{\beta }_1,...,\varvec{\beta }_M,\varvec{\mu }_1,...,\varvec{\mu }_M,\varvec{\eta },\mathbf {s}^{k},\mathbf {u}_1^{k},...,\mathbf {u}_M^{k},\mathbf {v}_1^{k},...,\mathbf {v}_M^{k},\\&\qquad \mathbf {q}_1^{k},...,\mathbf {q}_M^{k}), \end{aligned} \end{aligned}$$
for any \((\mathbf {z}_1,...,\mathbf {z}_M,\varvec{\beta }_1,...,\varvec{\beta }_M,\varvec{\mu }_1,...,\varvec{\mu }_M,\varvec{\eta })\).
It then follows that
$$\begin{aligned} \begin{aligned}&\mathcal {L}(\varvec{\beta }^{k+1},\varvec{\mu }^{k+1}, \mathbf {z}_1^{k+1},...,\mathbf {z}_M^{k+1},\varvec{\beta }_1^{k+1},...,\varvec{\beta }_M^{k+1},\varvec{\mu }_1^{k+1},...,\varvec{\mu }_M^{k+1},\varvec{\eta }^{k+1},\\&\qquad \mathbf {s}^{k},\mathbf {u}_1^{k},...,\mathbf {u}_M^{k},\mathbf {v}_1^{k},...,\mathbf {v}_M^{k},\mathbf {q}_1^{k},...,\mathbf {q}_M^{k})\\&\quad \le \inf _{\begin{array}{c} \mathbf {z},\varvec{\beta },\varvec{\mu },\varvec{\eta } \end{array} }\mathcal {L}(\varvec{\beta }^{k+1}, \mathbf {z}_1,...,\mathbf {z}_M,\varvec{\beta }_1,...,\varvec{\beta }_M,\varvec{\mu }_1,...,\varvec{\mu }_M,\varvec{\eta }, \mathbf {s}^{k},\mathbf {u}_1^{k},...,\mathbf {u}_M^{k},\mathbf {v}_1^{k},...,\mathbf {v}_M^{k},\mathbf {q}_1^{k},...,\mathbf {q}_M^{k})\\&\quad =\frac{1}{M}\sum _{m=1}^M\Vert \mathbf {y}_m-\mathbf {X}_m\varvec{\beta }^{k+1}-\mathbf {A}_m\varvec{\mu }^{k+1}\Vert ^2 + P_N((\mathbf {W}\varvec{\mu }^{k+1}), \varvec{\pi })\triangleq l^{k+1}. \end{aligned} \end{aligned}$$
We write the objective function \(\mathcal {L}\) in the following form
$$\begin{aligned} \mathcal {L} = \mathcal {L}_0+\mathcal {L}_1, \end{aligned}$$
where
$$\begin{aligned} \begin{aligned}&\mathcal {L}_0(\varvec{\beta },\varvec{\mu },\mathbf {z}_1,...,\mathbf {z}_M,\varvec{\beta }_1,...,\varvec{\beta }_M,\varvec{\mu }_1,...,\varvec{\mu }_M,\varvec{\eta },\mathbf {s},\mathbf {u}_1,...,\mathbf {u}_M,\mathbf {v}_1,...,\mathbf {v}_M,\mathbf {q}_1,...,\mathbf {q}_M)\\&\quad =\frac{c}{2}\Vert \mathbf {W}\varvec{\mu }-\varvec{\eta }\Vert ^2 +<\mathbf {W}\varvec{\mu }-\varvec{\eta }, \mathbf {s}>\\&\qquad +\sum _{m=1}^M\bigg (\frac{d_m}{2}\Vert \mathbf {y}_m-\mathbf {X}_m\varvec{\beta }_m-\mathbf {A}_m\mu _m-\mathbf {z}_m\Vert ^2\\&\qquad +<\mathbf {y}_m-\mathbf {X}_m\varvec{\beta }_m-\mathbf {A}_m\mu _m-\mathbf {z}_m,\mathbf {u}_m>\bigg )\\&\qquad +\sum _{m=1}^M\left( \frac{b_m}{2}\Vert \varvec{\beta }-\varvec{\beta }_m\Vert ^2+<\varvec{\beta }-\varvec{\beta }_m,\mathbf {v}_m>\right) \\&\qquad +\sum _{m=1}^M\left( \frac{a_m}{2}\Vert \varvec{\mu }-\varvec{\mu }_m\Vert ^2+<\varvec{\mu }-\varvec{\mu }_m,\mathbf {q}_m>\right) ,\\ \text {and}\\&\qquad \mathcal {L}_1(\varvec{\beta },\varvec{\mu },\mathbf {z}_1,...,\mathbf {z}_M,\varvec{\beta }_1,...,\varvec{\beta }_M,\varvec{\mu }_1,...,\varvec{\mu }_M,\varvec{\eta },\mathbf {s},\mathbf {u}_1,...,\mathbf {u}_M,\mathbf {v}_1,...,\mathbf {v}_M,\mathbf {q}_1,...,\mathbf {q}_M)\\&\quad =\frac{1}{M}\sum _{m=1}^M\Vert \mathbf {z}_m\Vert ^2 + P_N(\varvec{\eta }, \varvec{\pi }). \end{aligned} \end{aligned}$$
Since \(\mathcal {L}_0\) is differentiable with respect to \((\mathbf {z}_m,\varvec{\beta }_m,\varvec{\mu }_m)\), where \(m=1,...M\) and \(\mathcal {L}_1\) is convex with respect to \(\varvec{\eta }\). By Lemma 3.1 and Theorem 4.1 of Tseng (2001), we know that there exists a stationary point of \(\mathcal {L}\), denoted as \((\varvec{\beta }^*,\mathbf {s}^*,\mathbf {u}_m^*, \mathbf {v}_m^*)\), where \(m = 1, ... M\) and let \(l^*\) be its corresponding value of the objective function \(\mathcal {L}\).
Therefore we have
$$\begin{aligned} l^* = \lim _{k\rightarrow \infty }l^{k} = \lim _{k\rightarrow \infty }l^{k+t}, \end{aligned}$$
for any positive integer t. Note that
$$\begin{aligned} \left( \begin{array}{c} \mathbf {s}^{k+t}\\ \mathbf {u}_1^{k+t}\\ \vdots \\ \mathbf {u}_M^{k+t}\\ \mathbf {v}_1^{k+t}\\ \vdots \\ \mathbf {v}_M^{k+t}\\ \mathbf {q}_1^{k+t}\\ \vdots \\ \mathbf {q}_M^{k+t} \end{array}\right) = \left( \begin{array}{c} \mathbf {s}^{k}\\ \mathbf {u}_1^{k}\\ \vdots \\ \mathbf {u}_M^{k}\\ \mathbf {v}_1^{k}\\ \vdots \\ \mathbf {v}_M^{k}\\ \mathbf {q}_1^{k}\\ \vdots \\ \mathbf {q}_M^{k} \end{array}\right) + \sum _{i=1}^{t-1}\left( \begin{array}{c} c(\mathbf {W}\varvec{\mu }^{k+i}-\varvec{\eta }^{k+i})\\ d_1(\mathbf {y}_1 - \mathbf {X}_1\varvec{\beta }_1^{k+i} -\mathbf {A}_m\mu _m^{k+i}- \mathbf {z}_1^{k+i})\\ \vdots \\ d_M(\mathbf {y}_M - \mathbf {X}_M\varvec{\beta }_M^{k+i} -\mathbf {A}_m\mu _m^{k+i}- \mathbf {z}_M^{k+i})\\ b_1(\varvec{\beta }^{k+i}-\varvec{\beta }_1^{k+i})\\ \vdots \\ b_M(\varvec{\beta }^{k+i}-\varvec{\beta }_M^{k+i})\\ a_1(\varvec{\mu }^{k+i}-\varvec{\mu }_1^{k+i})\\ \vdots \\ a_M(\varvec{\mu }^{k+i}-\varvec{\mu }_M^{k+i}) \end{array} \right) , \end{aligned}$$
so we have
$$\begin{aligned}&\mathcal {L}(\varvec{\beta }^{k+1},\varvec{\mu }^{k+1}, \mathbf {z}_1^{k+t+1},...,\mathbf {z}_M^{k+t+1},\varvec{\beta }_1^{k+t+1},...,\varvec{\beta }_M^{k+t+1},\varvec{\mu }_1^{k+t+1},...,\varvec{\mu }_M^{k+t+1},\varvec{\eta }^{k+t+1},\\&\qquad \mathbf {s}^{k+t},\mathbf {u}_1^{k+t},...,\mathbf {u}_M^{k+t},\mathbf {v}_1^{k+t},...,\mathbf {v}_M^{k+t},\mathbf {q}_1^{k+t},...,\mathbf {q}_M^{k+t})\\&\quad =\frac{1}{M}\sum _{m=1}^M\Vert \mathbf {z}_m^{k+t+1}\Vert ^2 + P_N(\varvec{\eta }_j^{k+t+1}, \varvec{\pi })\\&\qquad +\frac{d_1}{2}\Vert \mathbf {y}_1-\mathbf {X}_1\varvec{\beta }_1^{k+t+1}-\mathbf {A}_m\mu _1^{k+t+1}-\mathbf {z}_1^{k+t+1}\Vert ^2\\&\qquad +...+\frac{d_M}{2}\Vert \mathbf {y}_M-\mathbf {X}_M\varvec{\beta }_M^{k+t+1}-\mathbf {A}_M\mu _M^{k+t+1}-\mathbf {z}_M^{k+t+1}\Vert ^2\\&\qquad +\frac{b_1}{2}\Vert \varvec{\beta }^{k+t+1}-\varvec{\beta }_1^{k+t+1}\Vert ^2+...+\frac{b_M}{2}\Vert \varvec{\beta }^{k+t+1}-\varvec{\beta }_M^{k+t+1}\Vert ^2 +\frac{c}{2}\Vert \mathbf {W}\varvec{\beta }^{k+t+1}-\varvec{\eta }^{k+t+1}\Vert ^2\\&\qquad +\frac{a_1}{2}\Vert \varvec{\mu }^{k+t+1}-\varvec{\mu }_1^{k+t+1}\Vert ^2+...+\frac{a_M}{2}\Vert \varvec{\mu }^{k+t+1}-\varvec{\mu }_M^{k+t+1}\Vert ^2\\&\qquad +\mathbf {u}_1^{k+tT}(\mathbf {y}_1-\mathbf {X}_1\varvec{\beta }_1^{k+t+1}-\mathbf {z}_1^{k+t+1})+...+\mathbf {u}_M^{k+tT}(\mathbf {y}_M-\mathbf {X}_M\varvec{\beta }_M^{k+t+1}-\mathbf {z}_M^{k+t+1})\\&\qquad +\mathbf {v}_1^{k+tT}(\varvec{\beta }^{k+t+1}-\varvec{\beta }_1^{k+t+1})+...+\mathbf {v}_M^{k+tT}(\varvec{\beta }^{k+t+1}-\varvec{\beta }_M^{k+t+1})\\&\qquad +\mathbf {q}_1^{k+tT}(\varvec{\mu }^{k+t+1}-\varvec{\mu }_1^{k+t+1})+...+\mathbf {q}_M^{k+tT}(\varvec{\mu }^{k+t+1}-\varvec{\mu }_M^{k+t+1})\\&\qquad +\mathbf {s}^{k+tT}(\mathbf {W}\varvec{\mu }^{k+t+1}-\varvec{\eta }^{k+t+1})\\&\quad =\frac{1}{M}\sum _{m=1}^M\Vert \mathbf {z}_m^{k+t+1}\Vert ^2 + P_N(\varvec{\eta }_j^{k+t+1}, \varvec{\pi })\\&\qquad +\frac{a_1}{2}\Vert \mathbf {y}_1-\mathbf {X}_1\varvec{\beta }_1^{k+t+1}-\mathbf {z}_1^{k+t+1}\Vert ^2+...+\frac{a_M}{2}\Vert \mathbf {y}_M-\mathbf {X}_M\varvec{\beta }_M^{k+t+1}-\mathbf {z}_M^{k+t+1}\Vert ^2\\&\qquad +\frac{b_1}{2}\Vert \varvec{\beta }^{k+t+1}-\varvec{\beta }_1^{k+t+1}\Vert ^2+...+\frac{b_M}{2}\Vert \varvec{\beta }^{k+t+1}-\varvec{\beta }_M^{k+t+1}\Vert ^2 +\frac{c}{2}\Vert \mathbf {W}\varvec{\beta }^{k+t+1}-\varvec{\eta }^{k+t+1}\Vert ^2\\&\qquad +\mathbf {u}_1^{kT}(\mathbf {y}_1-\mathbf {X}_1\varvec{\beta }_1^{k+t+1}-\mathbf {z}_1^{k+t+1})+...+\mathbf {u}_M^{kT}(\mathbf {y}_M-\mathbf {X}_M\varvec{\beta }_M^{k+t+1}-\mathbf {z}_M^{k+t+1})\\&\qquad +\mathbf {v}_1^{kT}(\varvec{\beta }^{k+t+1}-\varvec{\beta }_1^{k+t+1})+...+\mathbf {v}_M^{kT}(\varvec{\beta }^{k+t+1}-\varvec{\beta }_M^{k+t+1})+\mathbf {s}^{kT}(\mathbf {W}\varvec{\beta }^{k+t+1}-\varvec{\eta }^{k+t+1})\\&\qquad +\sum _{i=1}^{t-1}d_1(\mathbf {y}_1-\mathbf {X}_1\varvec{\beta }_1^{k+i}-\mathbf {A}_1\mu _1^{k+i}-\mathbf {z}_1^{k+i})^{\top }(\mathbf {y}_1-\mathbf {X}_1\varvec{\beta }_1^{k+t+1}-\mathbf {A}_1\mu _1^{k+t+1}-\mathbf {z}_1^{k+t+1})\\&\qquad \vdots \\&\qquad +\sum _{i=1}^{t-1}d_M(\mathbf {y}_M-\mathbf {X}_M\varvec{\beta }_M^{k+i}-\mathbf {A}_M\mu _M^{k+i}-\mathbf {z}_M^{k+i})^{\top }(\mathbf {y}_M-\mathbf {X}_M\varvec{\beta }_M^{k+t+1}-\mathbf {A}_M\mu _M^{k+t+1}\\&\qquad -\mathbf {z}_M^{k+t+1})+\sum _{i=1}^{t-1}b_1(\varvec{\beta }^{k+i}-\varvec{\beta }_1^{k+i})^{\top }(\varvec{\beta }^{k+t+1}-\varvec{\beta }_1^{k+t+1})\\&\qquad \vdots \\&\qquad +\sum _{i=1}^{t-1}b_M(\varvec{\beta }^{k+i}-\varvec{\beta }_M^{k+i})^{\top }(\varvec{\beta }^{k+t+1}-\varvec{\beta }_M^{k+t+1})+\sum _{i=1}^{t-1}a_1(\varvec{\mu }^{k+i}-\varvec{\mu }_1^{k+i})^{\top }(\varvec{\mu }^{k+t+1}\\&\qquad -\varvec{\mu }_1^{k+t+1})\\&\qquad \vdots \\&\qquad +\sum _{i=1}^{t-1}a_M(\varvec{\mu }^{k+i}-\varvec{\mu }_M^{k+i})^{\top }(\varvec{\mu }^{k+t+1}-\varvec{\mu }_M^{k+t+1})+\sum _{i=1}^{t-1}c(\mathbf {W}\varvec{\mu }^{k+i}-\varvec{\eta }^{k+i})^{\top }(\mathbf {W}\varvec{\mu }^{k+t+1}\\&\qquad -\varvec{\eta }^{k+t+1})\\&\quad \le l^{k+t+1}. \end{aligned}$$
Hence, we obtain that
$$\begin{aligned}&\lim _{k\rightarrow \infty }\mathcal {L}(\varvec{\beta }^{k+1},\varvec{\mu }^{k+1}, \mathbf {z}_1^{k+t+1},...,\mathbf {z}_M^{k+t+1},\varvec{\beta }_1^{k+t+1},...,\varvec{\beta }_M^{k+t+1},\varvec{\mu }_1^{k+t+1},...,\varvec{\mu }_M^{k+t+1},\varvec{\eta }^{k+t+1},\\&\qquad \mathbf {s}^{k+t},\mathbf {u}_1^{k+t},...,\mathbf {u}_M^{k+t},\mathbf {v}_1^{k+t},...,\mathbf {v}_M^{k+t},\mathbf {q}_1^{k+t},...,\mathbf {q}_M^{k+t})\\&\quad =\frac{1}{M}\sum _{m=1}^M\Vert \mathbf {z}_m^*\Vert ^2 + P_N(\varvec{\eta }_j^*, \varvec{\pi })\\&\qquad +\lim _{k\rightarrow \infty }\mathbf {u}_1^{k\top }(\mathbf {y}_1-\mathbf {X}_1\varvec{\beta }_1^*-\mathbf {A}_1\varvec{\mu }_1^*-\mathbf {z}_1^*)+...+\lim _{k\rightarrow \infty }\mathbf {u}_M^{k\top }(\mathbf {y}_M-\mathbf {X}_M\varvec{\beta }_M^*-\mathbf {A}_M\varvec{\mu }_M^*\\&\qquad -\mathbf {z}_M^*)+\lim _{k\rightarrow \infty }\mathbf {v}_1^{k\top }(\varvec{\beta }^*-\varvec{\beta }_1^*)+...+\lim _{k\rightarrow \infty }\mathbf {v}_M^{k\top }(\varvec{\beta }^*-\varvec{\beta }_M^*)\\&\qquad +\lim _{k\rightarrow \infty }\mathbf {q}_1^{k\top }(\varvec{\mu }^*-\varvec{\mu }_1^*)+...+\lim _{k\rightarrow \infty }\mathbf {q}_M^{k\top }(\varvec{\mu }^*-\varvec{\mu }_M^*)\\&\qquad +\lim _{k\rightarrow \infty }\mathbf {s}^{k\top }(\mathbf {W}\varvec{\mu }^*-\varvec{\eta }^*)\\&\qquad +\left( t-\frac{1}{2}\right) (a_1\Vert \mathbf {y}_1-\mathbf {X}_1\varvec{\beta }_1^*-\mathbf {z}_1^*\Vert ^2+...+a_M\Vert \mathbf {y}_M-\mathbf {X}_M\varvec{\beta }_M^*-\mathbf {z}_M^*\Vert ^2)\\&\qquad +\left( t-\frac{1}{2}\right) (b_1\Vert \varvec{\beta }^*-\varvec{\beta }_1^*\Vert ^2+...+b_M\Vert \varvec{\beta }^*-\varvec{\beta }_M^*\Vert ^2)\\&\qquad +\left( t-\frac{1}{2}\right) (c\Vert \mathbf {W}\varvec{\beta }^*-\varvec{\eta }^*\Vert ^2)\\&\quad =l^*+\lim _{k\rightarrow \infty }\mathbf {u}_1^{k\top }(\mathbf {y}_1-\mathbf {X}_1\varvec{\beta }_1^*-\mathbf {A}_1\varvec{\mu }_1^*-\mathbf {z}_1^*)+...+\lim _{k\rightarrow \infty }\mathbf {u}_M^{k\top }(\mathbf {y}_M-\mathbf {X}_M\varvec{\beta }_M^*-\mathbf {A}_M\varvec{\mu }_m^*\\&\qquad -\mathbf {z}_M^*)+\lim _{k\rightarrow \infty }\mathbf {v}_1^{k\top }(\varvec{\beta }^*-\varvec{\beta }_1^*)+...+\lim _{k\rightarrow \infty }\mathbf {v}_M^{k\top }(\varvec{\beta }^*-\varvec{\beta }_M^*)\\&\qquad +\lim _{k\rightarrow \infty }\mathbf {q}_1^{k\top }(\varvec{\mu }^*-\varvec{\mu }_1^*)+...+\lim _{k\rightarrow \infty }\mathbf {q}_M^{k\top }(\varvec{\mu }^*-\varvec{\mu }_M^*)\\&\qquad +\lim _{k\rightarrow \infty }\mathbf {s}^{k\top }(\mathbf {W}\varvec{\mu }^*-\varvec{\eta }^*)\\&\qquad +\left( t-\frac{1}{2}\right) (d_1\Vert \mathbf {y}_1-\mathbf {X}_1\varvec{\beta }_1^*-\mathbf {A}_1\varvec{\mu }_1^*-\mathbf {z}_1^*\Vert ^2+...+d_M\Vert \mathbf {y}_M-\mathbf {X}_M\varvec{\beta }_M^*-\mathbf {A}_M\varvec{\mu }_M^*\\&\qquad -\mathbf {z}_M^*\Vert ^2)+\left( t-\frac{1}{2}\right) (b_1\Vert \varvec{\beta }^*-\varvec{\beta }_1^*\Vert ^2+...+b_M\Vert \varvec{\beta }^*-\varvec{\beta }_M^*\Vert ^2)\\&\qquad +\left( t-\frac{1}{2}\right) (a_1\Vert \varvec{\mu }^*-\varvec{\mu }_1^*\Vert ^2+...+a_M\Vert \varvec{\mu }^*-\varvec{\mu }_M^*\Vert ^2)\\&\qquad +\left( t-\frac{1}{2}\right) (c\Vert \mathbf {W}\varvec{\mu }^*-\varvec{\eta }^*\Vert ^2)\\&\quad \le l^*, \end{aligned}$$
for any positive integer t. It then follows that
$$\begin{aligned} \lim _{k\rightarrow \infty }\Vert \mathbf {r}^{k} \Vert =\lim _{k\rightarrow \infty }\left\Vert \left( \begin{array}{c} \mathbf {y}_1-\mathbf {X}_1\varvec{\beta }_1^{k}-\mathbf {A}_1\varvec{\mu }_1^{k}-\mathbf {z}_1^{k}\\ ...\\ \mathbf {y}_M-\mathbf {X}_M\varvec{\beta }_M^{k}-\mathbf {A}_M\varvec{\mu }_M^{k}-\mathbf {z}_M^{k}\\ \varvec{\beta }^{k} - \varvec{\beta }_1^{k}\\ ...\\ \varvec{\beta }^{k} - \varvec{\beta }_M^{k}\\ \varvec{\mu }^{k} - \varvec{\mu }_1^{k}\\ ...\\ \varvec{\mu }^{k} - \varvec{\mu }_M^{k}\\ \mathbf {W}\varvec{\mu }^{k} - \varvec{\eta }^{k} \end{array} \right) \right\Vert =\left\Vert \left( \begin{array}{c} \mathbf {y}_1-\mathbf {X}_1\varvec{\beta }_1^*-\mathbf {A}_1\varvec{\mu }_1^*-\mathbf {z}_1^*\\ ...\\ \mathbf {y}_M-\mathbf {X}_M\varvec{\beta }_M^*-\mathbf {A}_M\varvec{\mu }_M^*-\mathbf {z}_M^*\\ \varvec{\beta }^* - \varvec{\beta }_1^*\\ ...\\ \varvec{\beta }^* - \varvec{\beta }_M^*\\ \varvec{\mu }^* - \varvec{\mu }_1^*\\ ...\\ \varvec{\mu }^* - \varvec{\mu }_M^*\\ \mathbf {W}\varvec{\mu }^* - \varvec{\eta }^* \end{array} \right) \right\Vert =0. \end{aligned}$$
Thus, the primal feasibility condition is met.
For the dual residual, by definition, \(\varvec{\mu }^{{k+1}}\) minimizes
$$\begin{aligned} \mathcal {L}(\varvec{\beta },\mathbf {z}_1^{k},...,\mathbf {z}_M^{k},\varvec{\beta }_1^{k},...,\varvec{\beta }_M^{k},\varvec{\mu }_1,...,\varvec{\mu }_M,\varvec{\eta }^{k},\mathbf {s}^{k},\mathbf {u}_1^{k},...,\mathbf {u}_M^{k},\mathbf {v}_1^{k},...,\mathbf {v}_M^{k},\mathbf {q}_1^{k},...,\mathbf {q}_M^{k}), \end{aligned}$$
so we obtain that
$$\begin{aligned} \mathbf {0}_p\in \left( \frac{\partial \mathcal {L}}{\partial \varvec{\mu }}\right) . \end{aligned}$$
Therefore, we have
$$\begin{aligned} \begin{aligned} \mathbf {0}_p&=c\mathbf {W}^{\top }(\mathbf {W}\varvec{\mu }^{k+1}-\varvec{\eta }^{k}) + \sum _{m=1}^Ma_m(\varvec{\mu }^{k+1}-\varvec{\mu }_m^{k}) + \mathbf {W}^{\top }\mathbf {s}^{k}+\sum _{m=1}^M\mathbf {q}_m^{k}\\&=c\mathbf {W}^{\top }(\mathbf {W}\varvec{\mu }^{k+1}-\varvec{\eta }^{k+1}) + \sum _{m=1}^Ma_m(\varvec{\mu }^{k+1}-\varvec{\mu }_m^{k+1}) + \mathbf {W}^{\top }\mathbf {s}^{k+1}+\sum _{m=1}^M\mathbf {q}_m^{k+1}\\&\quad +c\mathbf {W}^{\top }(\varvec{\eta }^{k+1}-\varvec{\eta }^{k}) + \sum _{m=1}^Ma_m(\varvec{\mu }_m^{k+1}-\varvec{\mu }_m^{k})- \mathbf {W}^{\top }(\mathbf {s}^{k+1}-\mathbf {s}^{k})-\sum _{m=1}^M(\mathbf {q}_m^{k+1}-\mathbf {q}_m^{k})\\&=\mathbf {W}^{\top }\mathbf {s}^{k+1}+\sum _{m=1}^M\mathbf {q}_m^{k+1}+c\mathbf {W}^{\top }(\varvec{\eta }^{k+1}-\varvec{\eta }^{k}) + \sum _{m=1}^Ma_m(\varvec{\mu }_m^{k+1}-\varvec{\mu }_m^{k}). \end{aligned} \end{aligned}$$
Thus, the dual residual is given by
$$\begin{aligned} \mathbf {d}^{k+1} = -c\mathbf {W}^{\top }(\varvec{\eta }^{k+1}-\varvec{\eta }^{k}) - \sum _{m=1}^Ma_m(\varvec{\mu }_m^{k+1}-\varvec{\mu }_m^{k}). \end{aligned}$$
Furthermore, we obtain that
$$\begin{aligned} \lim _{k\rightarrow \infty }\mathbf {d}^{k+1} = -c\mathbf {W}^{\top }(\varvec{\eta }^*-\varvec{\eta }^*) - \sum _{m=1}^Mb_m(\varvec{\mu }_m^*-\varvec{\mu }_m^*) = \mathbf {0}_p, \end{aligned}$$
which implies that the dual feasibility is satisfied. \(\square \)