Abstract
Most existing regression-based classification methods handle pixelwise noise via the \(\ell _1\)-norm or \(\ell _2\)-norm but neglect the structural information between pixels. Nuclear norm-based matrix regression approaches have achieved great success in addressing imagewise noise, but they may yield unreasonable regression and incorrect classification, especially when test images are severely corrupted by large occlusions and strong illumination variations, because they feed the corrupted test images directly into the reconstruction process, so the influence of the noise is unavoidable. To overcome this limitation, this paper presents a robust mixed-norm constrained regression model to deal with structural noise corruption. Specifically, the nuclear norm of the error between the corrupted test image and its corresponding recovered image is exploited as a regularization term to characterize the low-rank noise structure, and the Frobenius norm is used to measure the difference between the recovered image and the reconstructed image, since the recovered image contains less noise. We then adopt the alternating direction method of multipliers (ADMM) to solve the proposed models efficiently. Furthermore, a theoretical convergence proof and a detailed analysis of the computational complexity are provided. Finally, extensive experiments on five well-known face databases demonstrate that the proposed methods outperform several state-of-the-art regression-based approaches, primarily in addressing noise caused by occlusion and illumination changes.
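To make the pipeline concrete, the following is a minimal illustrative sketch, not the authors' implementation: it assumes the reconstruction operator is a linear combination of training images, \(D({\mathbf{x}})=\sum _i x_i {\mathbf{A}}_i\), takes the coefficient penalty with \(p=2\) so the coefficient update is a closed-form ridge step, and alternates the three primal updates with a dual ascent step. The function names `svt` and `mixed_norm_admm` are hypothetical.

```python
import numpy as np

def svt(X, tau):
    """Singular value thresholding: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def mixed_norm_admm(A_list, H, lam=1.0, beta=1e-3, mu=1.0, n_iter=500):
    """Hypothetical ADMM sketch for
         min ||T||_F^2 + lam * ||H - H_t||_* + (beta/2) * ||x||_2^2
         s.t. D(x) - H_t - T = 0,  with D(x) = sum_i x_i * A_i.
    """
    p, q = H.shape
    M = np.stack([A.ravel() for A in A_list], axis=1)  # columns are vec(A_i)
    n = M.shape[1]
    H_t, T, Z = H.copy(), np.zeros_like(H), np.zeros_like(H)
    G = beta * np.eye(n) + mu * M.T @ M                # fixed normal-equation matrix
    for _ in range(n_iter):
        # x-step: ridge-regularized least squares (closed form for p = 2)
        x = np.linalg.solve(G, mu * M.T @ (H_t + T - Z / mu).ravel())
        Dx = (M @ x).reshape(p, q)
        # H_t-step: threshold the error E = H - H_t, then set H_t = H - E
        E = svt(H - Dx + T - Z / mu, lam / mu)
        H_t = H - E
        # T-step: closed form induced by the squared Frobenius term
        T = mu / (mu + 2.0) * (Dx - H_t + Z / mu)
        # dual ascent on the constraint D(x) - H_t - T = 0
        Z = Z + mu * (Dx - H_t - T)
    return x, H_t, T
```

At convergence the primal residual \(D({\mathbf{x}})-{\widetilde{\mathbf{H}}}-{\mathbf{T}}\) vanishes, which is exactly the property established in the convergence proof of the appendix.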










Acknowledgements
This work is partly supported by the National Key Research and Development Program of China (No. 2018YFB1004900) and the 111 Project (No. B13022).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix A Proof of Theorem 3
Proof
Since \(({{\mathbf{x}}^{*}},{\widetilde{\mathbf{H}}}^{*} ,{\mathbf{T}}^{*},{{\mathbf{Z}}}^{*})\) is a saddle point of \(L\), we have \(L({{\mathbf{x}}^{*}},{\widetilde{\mathbf{H}}}^{*},{\mathbf{T}}^{*},{{\mathbf{Z}}}^{*}) \le L({{\mathbf{x}}_{k+1}},{\widetilde{\mathbf{H}}}_{k+1} ,{\mathbf{T}}_{k+1},{{\mathbf{Z}}}^{*})\). Together with \({{D}}({{\mathbf{x}}^{*}})-{\widetilde{\mathbf{H}}}^{*} - {\mathbf{T}}^{*} = 0\), this can be rewritten as follows:
For ease of derivation, the augmented Lagrangian function of Eq. (5) can be reformulated as \({L_\mu }({\mathbf{x}},{\widetilde{\mathbf{H}}} ,{\mathbf{T}},{{\mathbf{Z}}})= {\Vert {\mathbf{T}} \Vert _F^2} + \lambda {\Vert {\mathbf{H}} - {\widetilde{\mathbf{H}}} \Vert _{*}} +\frac{\beta }{2} \left\| {\mathbf{x}} \right\| _p^{p} +\frac{\mu }{2} \Vert {{D}({\mathbf{x}})-{\widetilde{\mathbf{H}}}} - {\mathbf{T}} +\frac{1}{\mu } {{\mathbf{Z}}}\Vert _F^2-\frac{1 }{2\mu }\Vert {{\mathbf{Z}}}\Vert _F^2\).
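For reference, the alternating minimization used below solves the following subproblems in turn; this is a sketch reconstructed directly from \(L_\mu\) by fixing the remaining variables, with the coefficient penalty written as \(\frac{\beta }{2}\Vert {\mathbf{x}}\Vert _p^p\) as in the rest of the proof:

```latex
\begin{aligned}
{\mathbf{x}}_{k+1} &= \arg \min _{{\mathbf{x}}}\; \frac{\beta }{2}\Vert {\mathbf{x}}\Vert _p^p
  + \frac{\mu }{2}\Big \Vert {D}({\mathbf{x}})-{\widetilde{\mathbf{H}}}_{k} - {\mathbf{T}}_{k}
  + \tfrac{1}{\mu }{\mathbf{Z}}_{k}\Big \Vert _F^2,\\
{\widetilde{\mathbf{H}}}_{k+1} &= \arg \min _{{\widetilde{\mathbf{H}}}}\;
  \lambda \Vert {\mathbf{H}} - {\widetilde{\mathbf{H}}} \Vert _{*}
  + \frac{\mu }{2}\Big \Vert {D}({\mathbf{x}}_{k+1})-{\widetilde{\mathbf{H}}} - {\mathbf{T}}_{k}
  + \tfrac{1}{\mu }{\mathbf{Z}}_{k}\Big \Vert _F^2,\\
{\mathbf{T}}_{k+1} &= \arg \min _{{\mathbf{T}}}\; \Vert {\mathbf{T}}\Vert _F^2
  + \frac{\mu }{2}\Big \Vert {D}({\mathbf{x}}_{k+1})-{\widetilde{\mathbf{H}}}_{k+1} - {\mathbf{T}}
  + \tfrac{1}{\mu }{\mathbf{Z}}_{k}\Big \Vert _F^2.
\end{aligned}
```

The \({\widetilde{\mathbf{H}}}\)- and \({\mathbf{T}}\)-subproblems admit closed forms via singular value thresholding and a simple scaling, respectively.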
\({\mathbf{x}}_{k+1} = \arg \mathop {\min }\limits _{{\mathbf{x}}} {L_\mu }({\mathbf{x}},{\widetilde{\mathbf{H}}}_{k},{\mathbf{T}}_{k},{\mathbf{Z}}_{k})\), which is equivalent to
By virtue of \({{\mathbf{Z}}}_{k+1} ={\mathbf{Z}}_{k} +\mu ({D}({\mathbf{x}}_{k+1})-{\widetilde{\mathbf{H}}}_{k+1} - {\mathbf{T}}_{k+1})\), we can rearrange to obtain
This suggests that \({\mathbf{x}}_{k+1}\) minimizes \(\frac{\beta }{2}\Vert {\mathbf{x}}\Vert _p^p + (\text {Vec} ({\mathbf{Z}}_{k+1}) +\mu \text {Vec}({\widetilde{\mathbf{H}}}_{k+1}-{\widetilde{\mathbf{H}}}_{k})+ \mu \text {Vec}({\mathbf{T}}_{k+1} -{\mathbf{T}}_{k}))^{T}{\mathbf{M}}{\mathbf{x}}\). By a similar derivation, \({\widetilde{\mathbf{H}}}_{k+1}\) minimizes \(\lambda {\Vert {\mathbf{H}} - {\widetilde{\mathbf{H}}} \Vert _{*}} -\text {Tr}(({\mathbf{Z}}_{k+1} + \mu ({\mathbf{T}}_{k+1}-{\mathbf{T}}_{k}))^{T}{\widetilde{\mathbf{H}}})\), and \({\mathbf{T}}_{k+1}\) minimizes \(\Vert {\mathbf{T}}\Vert _F^2 - \text {Tr}(({\mathbf{Z}}_{k+1})^{T}{\mathbf{T}})\). Hence, we have
Adding these three inequalities, using \({{D}}({{\mathbf{x}}^{*}})-{\widetilde{\mathbf{H}}}^{*} - {\mathbf{T}}^{*} = 0\) and \({\mathbf{R}}_{k+1} = {D}({\mathbf{x}}_{k+1}) -{\widetilde{\mathbf{H}}}_{k+1}- {\mathbf{T}}_{k+1}\), and regrouping, we obtain
Adding (26) and (27) and rearranging, we obtain the following inequality:
Substituting \({\mathbf{T}}_{k+1}-{\mathbf{T}}^{*}={\mathbf{T}}_{k+1}-{\mathbf{T}}_{k}+{\mathbf{T}}_{k}-{\mathbf{T}}^{*}\) into the fourth term and using \(\text {Vec}^{T}(U)\text {Vec}(V)={\text {Tr}}(U^{T}V)\) for all \(U, V \in {\mathbb {R}}^{s \times t}\), the aforementioned inequality can be rewritten as
Next, we show that the two terms on the right-hand side of (28) are both nonnegative. To see this, note that \({\mathbf{T}}_{k+1}\) minimizes \(\Vert {\mathbf{T}}\Vert _F^2 - \text {Tr}(({\mathbf{Z}}_{k+1})^{T}{\mathbf{T}})\) and \({\mathbf{T}}_{k}\) minimizes \(\Vert {\mathbf{T}}\Vert _F^2 - \text {Tr}(({\mathbf{Z}}_{k})^{T}{\mathbf{T}})\), which is equivalent to
Adding the two inequalities above, using \({{\mathbf{Z}}}_{k+1} ={\mathbf{Z}}_{k} +\mu {\mathbf{R}}_{k+1}\) and reorganizing, we get \(\mu \text {Tr} (({\mathbf{T}}_{k+1}- {\mathbf{T}}_{k})^{T}{\mathbf{R}}_{k+1}) \ge 0\). Similarly, using the fact that \({\widetilde{\mathbf{H}}}_{k+1}\) minimizes \(\lambda {\Vert {\mathbf{H}} - {\widetilde{\mathbf{H}}} \Vert _{*}} -\text {Tr}(({\mathbf{Z}}_{k+1} + \mu ({\mathbf{T}}_{k+1}-{\mathbf{T}}_{k}))^{T}{\widetilde{\mathbf{H}}})\), we obtain \(\mu \text {Tr} (({\widetilde{\mathbf{H}}}_{k+1}- {\widetilde{\mathbf{H}}}_{k})^{T}({\mathbf{R}}_{k+1}+{\mathbf{T}}_{k+1}-{\mathbf{T}}_{k})) \ge 0\).
In addition, since \({\widetilde{\mathbf{H}}}_{k+1}-{\widetilde{\mathbf{H}}}_{k} ={\widetilde{\mathbf{H}}}_{k+1}-{\widetilde{\mathbf{H}}}^{*}+{\widetilde{\mathbf{H}}}^{*}-{\widetilde{\mathbf{H}}}_{k}\), then
Combining this with the previous step and multiplying through by 2, we can transform (28) into the following form
Using \({{\mathbf{Z}}}_{k+1} ={\mathbf{Z}}_{k} +\mu {\mathbf{R}}_{k+1}\) and completing the square, (29) can be rewritten as
Let \(V_{k} =\frac{1}{\mu }\Vert {\mathbf{Z}}_{k}-{\mathbf{Z}}^{*}\Vert _F^2+\mu \Vert {\widetilde{\mathbf{H}}}_{k}+{\mathbf{T}}_{k}-{\widetilde{\mathbf{H}}}^{*}-{\mathbf{T}}^{*}\Vert _F^2\); then inequality (30) can be abbreviated as
This indicates that \(V_{k}\) is nonincreasing since \(\mu >0\), i.e., \(V_{k+1} \le V_{k}\le V_{0}\). Iterating inequality (31), we acquire
which implies that \({\mathbf{R}}_{k}\rightarrow 0\), \({\widetilde{\mathbf{H}}}_{k+1}-{\widetilde{\mathbf{H}}}_{k} \rightarrow 0\) and \({\mathbf{T}}_{k+1}-{\mathbf{T}}_{k} \rightarrow 0\) as \(k \rightarrow \infty\) by the monotone convergence theorem. Thus, the right-hand sides of (26) and (27) both go to zero as \(k \rightarrow \infty\). Furthermore, we obtain \(\lim \nolimits _{k \rightarrow \infty } f_{k} =f^{*}\).
That is, \(\lim \nolimits _{k \rightarrow \infty }({\Vert {\mathbf{T}}_{k} \Vert _F^2} + \lambda {\Vert {\mathbf{H}} - {\widetilde{\mathbf{H}}}_{k} \Vert _{*}} +\frac{\beta }{2}\left\| {\mathbf{x}}_{k} \right\| _p^p) = {\Vert {\mathbf{T}}^{*} \Vert _F^2} + \lambda {\Vert {\mathbf{H}} - {\widetilde{\mathbf{H}}}^{*} \Vert _{*}} +\frac{\beta }{2} \left\| {\mathbf{x}}^{*} \right\| _p^p\), which implies that \(({\mathbf{x}}_{k},{\widetilde{\mathbf{H}}}_{k},{\mathbf{T}}_{k})\) approaches \(({\mathbf{x}}^{*},{\widetilde{\mathbf{H}}}^{*}, {\mathbf{T}}^{*})\) as \(k \rightarrow \infty\). By the previous analysis and \({\mathbf{Z}}_{k}\rightarrow {{\mathbf{Z}}}^{*}\), we have \(\lim \nolimits _{k \rightarrow \infty } {V_{k}} =\lim \nolimits _{k \rightarrow \infty }(\frac{1}{\mu }\Vert {\mathbf{Z}}_{k}-{\mathbf{Z}}^{*}\Vert _F^2+\mu \Vert {\widetilde{\mathbf{H}}}_{k}+{\mathbf{T}}_{k}-{\widetilde{\mathbf{H}}}^{*}-{\mathbf{T}}^{*}\Vert _F^2) = 0\). Thus, \({\widetilde{\mathbf{H}}}_{k}+{\mathbf{T}}_{k}-{\widetilde{\mathbf{H}}}^{*}-{\mathbf{T}}^{*} \rightarrow 0\). Evidently, \(\text {Tr}(({\widetilde{\mathbf{H}}}_{k}-{\widetilde{\mathbf{H}}}^{*})^{T}({\mathbf{T}}_{k}-{\mathbf{T}}^{*})) \ge 0\), then
Therefore, \({\widetilde{\mathbf{H}}}_{k}\rightarrow {\widetilde{\mathbf{H}}}^{*}\) and \({\mathbf{T}}_{k}\rightarrow {\mathbf{T}}^{*}\) by the squeeze rule. By virtue of \(\lim \nolimits _{k \rightarrow \infty } f_{k} =f^{*}\), we can derive \({\mathbf{x}}_{k}\rightarrow {\mathbf{x}}^{*}\). \(\square\)
Cite this article
Sang, X., Xu, Y., Lu, H. et al. Robust mixed-norm constrained regression with application to face recognitions. Neural Comput & Applic 32, 17551–17567 (2020). https://doi.org/10.1007/s00521-020-04925-4