Abstract
Traditional graph-based multi-view learning methods usually assume that the data are complete. In practice, however, some instances of certain views may be missing, which makes the corresponding graphs incomplete and weakens the benefit of graph regularization. To mitigate this negative effect, a novel method, called incomplete multi-view learning via consensus graph completion (IMLCGC), is proposed in this paper, which completes the incomplete graphs based on the consensus among different views and then fuses the completed graphs into a common graph. Specifically, IMLCGC develops a learning framework for incomplete multi-view data with three components: consensus low-dimensional representation, graph regularization, and consensus graph completion. Furthermore, a generalization error bound of the model is established based on Rademacher complexity, which shows theoretically why learning with incomplete multi-view data is difficult. Experimental results on six well-known datasets indicate that IMLCGC significantly outperforms state-of-the-art methods.
References
Zhao J, Xie XJ, Xu X, Sun SL (2017) Multi-view learning overview: recent progress and new challenges. Inf Fusion 38:43–54
Cai WL, Zhou HH, Xu L (2021) A multi-view co-training clustering algorithm based on global and local structure preserving. IEEE Access 9:29293–29302
Kumar A, Daumé H (2011) A co-training approach for multi-view spectral clustering. In: Proceedings of the 28th international conference on machine learning (ICML-11), pp 393–400
Liu C, Yuen PC (2011) A boosted co-training algorithm for human action recognition. IEEE Trans Circuits Syst Video Technol 21(9):1203–1213
Yang XH, Liu WF, Liu W, Tao DC (2021) A survey on canonical correlation analysis. IEEE Trans Knowl Data Eng 33(6):2349–2368
Brbic M, Kopriva I (2018) Multi-view low-rank sparse subspace clustering. Pattern Recogn 73:247–258
Zhao Y, You X, Yu S, Xu C, Yuan W, Jing X-Y, Zhang T, Tao D (2018) Multi-view manifold learning with locality alignment. Pattern Recogn 78:154–166
Xie XJ, Sun SL (2019) General multi-view learning with maximum entropy discrimination. Neurocomputing 332:184–192
Liu XW, Dou Y, Yin JP, Wang L, Zhu E (2016) Multiple kernel k-means clustering with matrix-induced regularization. In: 30th Association-for-the-Advancement-of-Artificial-Intelligence (AAAI) conference on artificial intelligence, pp 1888–1894
Chao GQ, Sun SL (2016) Multi-kernel maximum entropy discrimination for multi-view learning. Intell Data Anal 20(3):481–493
Zhao W, Xu C, Guan ZY, Liu Y (2021) Multiview concept learning via deep matrix factorization. IEEE Trans Neural Netw Learn Syst 32(2):814–825
Yan XQ, Hu SZ, Mao YQ, Ye YD, Yu H (2021) Deep multi-view learning methods: a review. Neurocomputing 448:106–129
Sun G, Cong Y, Zhang YL, Zhao GS, Fu Y (2021) Continual multiview task learning via deep matrix factorization. IEEE Trans Neural Netw Learn Syst 32(1):139–150
Tan G, Wang Z, Shi Z (2021) Proportional-integral state estimator for quaternion-valued neural networks with time-varying delays. IEEE Trans Neural Netw Learn Syst. https://doi.org/10.1109/TNNLS.2021.3103979
Liu Y, Fan L, Zhang C, Zhou T, Xiao Z, Geng L, Shen D (2021) Incomplete multi-modal representation learning for Alzheimer’s disease diagnosis. Med Image Anal. https://doi.org/10.1016/j.media.2020.101953
Yang WQ, Shi YH, Gao Y, Wang L, Yang M (2018) Incomplete-data oriented multiview dimension reduction via sparse low-rank representation. IEEE Trans Neural Netw Learn Syst 29(12):6276–6291
Li SY, Jiang Y, Zhou ZH (2014) Partial multi-view clustering. In: 28th AAAI conference on artificial intelligence, pp. 1968–1974
Wen J, Sun HJ, Fei LK, Li JX, Zhang Z, Zhang B (2021) Consensus guided incomplete multi-view spectral clustering. Neural Netw 133:207–219
Li P, Chen SC (2020) Shared Gaussian process latent variable model for incomplete multiview clustering. IEEE Trans Cybern 50(1):61–73
Qiao LS, Zhang LM, Chen SC, Shen DG (2018) Data-driven graph construction and graph learning: a review. Neurocomputing 312:336–351
Feng X, Ke S, Shuo Y, Aziz A, Liangtian W, Shirui P, Huan L (2021) Graph learning: a survey. IEEE Trans Artif Intell 2(2):109–127
Wen J, Xu Y, Liu H (2020) Incomplete multiview spectral clustering with adaptive graph learning. IEEE Trans Cybern 50(4):1418–1429
Wen J, Zhang Z, Zhang Z, Fei LK, Wang M (2021) Generalized incomplete multiview clustering with flexible locality structure diffusion. IEEE Trans Cybern 51(1):101–114
Zhang N, Sun S (2022) Incomplete multiview nonnegative representation learning with multiple graphs. Pattern Recogn 123:108412
Wen J, Yan K, Zhang Z, Xu Y, Wang JQ, Fei LK, Zhang B (2021) Adaptive graph completion based incomplete multi-view clustering. IEEE Trans Multimedia 23:2493–2504
Chen J, Wang G, Giannakis GB (2019) Graph multiview canonical correlation analysis. IEEE Trans Signal Process 67(11):2826–2838
Shawe-Taylor J, Cristianini N (2005) Kernel methods for pattern analysis. Cambridge University Press, Cambridge
Zhang EH, Chen XH, Wang LP (2020) Consistent discriminant correlation analysis. Neural Process Lett 52(1):891–904
Wang C (2007) Variational Bayesian approach to canonical correlation analysis. IEEE Trans Neural Netw 18(3):905–910
Carroll JD (1968) Generalization of canonical correlation analysis to three or more sets of variables. In: Proceedings of the 76th annual convention of the american psychological association, vol 3, pp 227–228
Fu X, Huang KJ, Hong MY, Sidiropoulos ND, So AMC (2017) Scalable and flexible multiview max-var canonical correlation analysis. IEEE Trans Signal Process 65(16):4150–4165
Luo Y, Tao DC, Ramamohanarao K, Xu C, Wen YG (2015) Tensor canonical correlation analysis for multi-view dimension reduction. IEEE Trans Knowl Data Eng 27(11):3111–3124
Liu XW, Zhu XZ, Li MM, Wang L, Zhu E, Liu TL, Kloft M, Shen DG, Yin JP, Gao W (2020) Multiple kernel k-means with incomplete kernels. IEEE Trans Pattern Anal Mach Intell 42(5):1191–1204
Recht B, Fazel M, Parrilo PA (2010) Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization. SIAM Rev 52(3):471–501
Fazel M, Hindi H, Boyd SP (2001) A rank minimization heuristic with application to minimum order system approximation. In: Proceedings of the American Control Conference (ACC). IEEE, New York, pp 4734–4739
Kim E, Lee M, Oh S (2015) Elastic-net regularization of singular values for robust subspace learning. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR), pp 915–923
Candes EJ, Guo F (2002) New multiscale transforms, minimum total variation synthesis: applications to edge-preserving image reconstruction. Signal Process 82(11):1519–1543
Cai JF, Candes EJ, Shen ZW (2010) A singular value thresholding algorithm for matrix completion. SIAM J Optim 20(4):1956–1982
Wright TG, Trefethen LN (2001) Large-scale computation of pseudospectra using ARPACK and eigs. SIAM J Sci Comput 23(2):591–605
Maurer A (2006) The rademacher complexity of linear transformation classes. In: Lugosi G, Simon HU (eds) Learning theory, proceedings. Lecture notes in artificial intelligence, vol 40, pp 65–78
Liu TL, Tao DC, Xu D (2016) Dimensionality-dependent generalization bounds for k-dimensional coding schemes. Neural Comput 28(10):2213–2249
Maurer A, Pontil M (2010) K-dimensional coding schemes in hilbert spaces. IEEE Trans Inf Theory 56(11):5839–5846
Zhao H, Liu H, Fu Y (2016) Incomplete multi-modal visual data grouping. In: Proceedings of the 25th international joint conference on artificial intelligence (IJCAI), pp 2392–2398
Wen J, Zhang Z, Xu Y, Zhong ZF (2018) Incomplete multi-view clustering via graph regularized matrix factorization. In: Computer Vision—ECCV 2018 workshops, Pt Iv, vol 11132, pp 593–608
Candes EJ, Recht B (2008) Exact low-rank matrix completion via convex optimization. In: 46th annual allerton conference on communication, control, and computing, pp 806–827
Bartlett PL, Mendelson S (2003) Rademacher and Gaussian complexities: risk bounds and structural results. J Mach Learn Res 3(3):463–482
Acknowledgements
The author would like to thank the National Natural Science Foundation of China (Grants 11971231 and 1211530001) for its support.
Appendix A
A.1 Proof of Lemma 1
For convenience, let us denote
Before proving Lemma 1, we first state Lemma 3.
Lemma 3
Let \(Y\in \partial g\left( D \right) \) and \(Y'\in \partial g\left( D' \right) \); then the following inequality holds:
Proof
According to \(Y\in \partial g\left( D \right) \), we have \(Y=\mu {{Y}_{0}}+2\gamma D+\lambda F\), and similarly \({Y}'=\mu {Y'_0}+2\gamma {D}'+\lambda F\), where \(Y_0\in \partial \left\| D \right\| _*\) and \(Y'_0\in \partial \left\| D' \right\| _*\). It follows that
We now show that \(\left\langle {{Y}_{0}}-{Y'_0},D-{D}' \right\rangle >0\). From the definition of \(Y_0\), we have \({{\left\| {{Y}_{0}} \right\| }_{2}}\le 1\) and \(\left\langle {{Y}_{0}},D \right\rangle ={{\left\| D \right\| }_{*}}\); then [34, 45]
therefore
So, Eq. (A1) holds. \(\square \)
The proof of Lemma 1 is given as follows.
Proof
Let \(D^*\) and \(\Lambda ^*\) be the primal and dual optimal solutions of problem (11). Then the optimality condition is
where \(Y^*\in \partial g\left( D^* \right) \) and \(Y^k\in \partial g\left( D^k \right) \) for any k. Then we obtain
and further, according to Lemma 3, we obtain
From \({{P}_{\Omega }}\left( {{D}^{*}} \right) ={{P}_{\Omega }}\left( M \right) \), we have
Let \({{r}_{k}}={{\left\| {{P}_{\Omega }}\left( {{\Lambda }^{k}}-{{\Lambda }^{*}} \right) \right\| }_{F}}\), then the following formula holds according to Eqs. (A2) and (A3),
Since \(0<\rho <4\gamma \), we have \(4\gamma \rho -{{\rho }^{2}}>0\), which further implies the following two properties:
1. The sequence \(\left\{ {{\left\| {{P}_{\Omega }}\left( {{\Lambda }^{k}}-{{\Lambda }^{*}} \right) \right\| }_{F}}\right\} \) is nonincreasing, and it converges since it is bounded below.

2. \(\left\| {{P}_{\Omega }}\left( {{D}^{k}}-{{D}^{*}} \right) \right\| _{F}^{2}\rightarrow 0\) as \(k\rightarrow \infty \).
\(\square \)
A.2 Proof of Theorem 1
According to the above analysis, the optimal solution of each of the two subproblems can be obtained. Therefore, Algorithm 1 makes the objective function value of Eq. (7) decrease monotonically; since the objective is bounded below, convergence is guaranteed. Assume that Algorithm 1 converges to \(A^*\), \(\{W_i^*\}_{i=1}^m\), and \(D^*\); we next prove that this is a KKT point.
Proof
The Lagrangian function of problem (7) is
where \(\Gamma \) is the Lagrange multiplier. Taking the derivatives w.r.t. A, \(\{W_i\}_{i=1}^m\), D, \(\Gamma \), and \(\Lambda \), respectively, and setting them to zero, we obtain the KKT conditions of problem (7):
According to the solving steps for A and \(\{W_i\}_{i=1}^m\), \(A^*\) and \(\{W_i^*\}_{i=1}^m\) satisfy the following equation,
Further, \(A^*\) is obtained by solving the eigenvalue problem of Eq. (10), so Eqs. (A5a), (A5b) and (A5d) are established. Since the essence of SVT is to solve for D through Eq. (A5c), Eq. (A5c) is clearly satisfied. From Lemma 1, \(\left\| {\mathcal {P}_{\Omega }}\left( {{D}^{k}}-{{D}^{*}} \right) \right\| _{F}^{2}\rightarrow 0\) as \(k\rightarrow \infty \), so Eq. (A5e) is satisfied. In summary, Algorithm 1 converges to a KKT point of problem (7). \(\square \)
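The SVT step invoked in this proof admits a compact numerical illustration. The following is a minimal numpy sketch (function and variable names are ours, not from the paper): SVT soft-thresholds the singular values of its input, which is the proximal operator of a multiple of the nuclear norm.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: shrink the singular values of M by tau.
    This is the proximal operator of tau * ||.||_* (the nuclear norm)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)  # soft-threshold each singular value
    return U @ np.diag(s_shrunk) @ Vt

# Toy check: thresholding never increases the nuclear norm.
rng = np.random.default_rng(0)
M = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 5))  # rank-2 matrix
D = svt(M, 0.5)
nuc = lambda X: np.linalg.svd(X, compute_uv=False).sum()
assert nuc(D) <= nuc(M)
```

With a sufficiently large threshold, every singular value is shrunk to zero and the result is the zero matrix, matching the intuition that SVT biases the iterate toward low rank.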
A.3 Proof of Lemma 2
Proof
From the definition of \(p_i\left( x\right) \) and some algebraic manipulation, we obtain the following formulation,
Let
where \(w_i=\left[ a^T,-2a^TW_i^T,\text {vec}\left( W_iW_i^T\right) ^T \right] ^T\) and \(\varphi \left( x_i\right) =\left[ a^T,x_i^T,\text {vec}\left( x_ix_i^T\right) ^T \right] ^T\) for \(i=1, 2, \dots , m\). Therefore we can rewrite Eq. (18) as follows,
So it is easy to see that f(x) is a linear function of \(\Phi _{p} \left( x \right) \). We show below that the feature space \(\mathcal {F}\) is derived from \(\hat{k}_p\left( x,y\right) \):
\(\square \)
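The decomposition in this lemma can be checked numerically. The sketch below is our reading of the construction, assuming \(p\left( x\right) =\left\| a-Wx\right\| ^2\) and using a scalar 1 as the first feature coordinate (the paper carries \(a\) itself in both vectors; either convention yields the same inner product). All names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
k, d = 3, 4
a = rng.standard_normal(k)
W = rng.standard_normal((k, d))
x = rng.standard_normal(d)

# p(x) = ||a - W x||^2 expands into a constant term a.a, a term linear
# in x, and a term linear in vec(x x^T), so p is linear in phi(x):
w_vec = np.concatenate([[a @ a], -2 * W.T @ a, (W.T @ W).ravel()])
phi = np.concatenate([[1.0], x, np.outer(x, x).ravel()])

p_direct = np.sum((a - W @ x) ** 2)   # direct evaluation
assert np.isclose(p_direct, w_vec @ phi)  # matches <w, phi(x)>
```

This is exactly the observation the lemma relies on: a squared reconstruction error becomes a linear functional after lifting the input to the quadratic feature map.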
A.4 Proof of Theorem 2
Proof
Let us first derive the upper bound on \(\left\| w\right\| _2^2\) as follows,
So \(\left\| w\right\| _2<B\). Based on Lemma 2 and the assumptions, it is easy to see that \(f\left( x\right) \) belongs to the function class
Obviously, \(f\left( x\right) \ge 0\), and it is easy to show that \(f\left( x\right) \) has an upper bound,
As a result, by exploiting McDiarmid's concentration inequality [27, 46], the following inequality holds with probability at least \(1-\delta \):
where \(\hat{R}_n\left( \mathcal {F}_{B}\right) \) is the empirical Rademacher complexity of \(\mathcal {F}_{B}\). Now let us estimate the upper bound of \(\hat{R}_n\left( \mathcal {F}_{B}\right) \). According to its definition, the following inequality holds,
where \(\left\{ \sigma _k\right\} _{k=1}^n\) are i.i.d. Rademacher random variables. The first inequality uses the Cauchy–Schwarz inequality. The last inequality holds because the square root function is concave, so it follows from Jensen's inequality. As a result,
Equation (19) can be proved by combining Eqs. (A8), (A9) and (A10). \(\square \)
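The empirical Rademacher complexity bounded above can also be approximated by Monte Carlo for a finite function class, which is a useful sanity check on such bounds. The sketch below is ours (names and setup hypothetical, not from the paper): it draws i.i.d. Rademacher signs and averages the supremum of the signed empirical sums.

```python
import numpy as np

def empirical_rademacher(F_values, n_draws=2000, seed=0):
    """Monte Carlo estimate of (1/n) E_sigma [ sup_f sum_k sigma_k f(x_k) ]
    for a finite class; F_values is a (num_functions, n) array holding each
    function's values on the fixed sample x_1, ..., x_n."""
    rng = np.random.default_rng(seed)
    m, n = F_values.shape
    total = 0.0
    for _ in range(n_draws):
        sigma = rng.choice([-1.0, 1.0], size=n)  # i.i.d. Rademacher signs
        total += np.max(F_values @ sigma) / n    # sup over the finite class
    return total / n_draws

# Toy class of 10 functions evaluated on a sample of size n = 50.
vals = np.random.default_rng(2).standard_normal((10, 50))
r_hat = empirical_rademacher(vals)
```

Since the supremum of centered signed sums has nonnegative expectation, the estimate is positive, and it shrinks as the sample size n grows, consistent with the \(O\left( 1/\sqrt{n}\right) \) behavior of bounds of this type.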
Cite this article
Zhang, H., Chen, X., Zhang, E. et al. Incomplete Multi-view Learning via Consensus Graph Completion. Neural Process Lett 55, 3923–3952 (2023). https://doi.org/10.1007/s11063-022-10973-9