Neurocomputing

Volume 521, 7 February 2023, Pages 89–98

Max–Min Robust Principal Component Analysis

https://doi.org/10.1016/j.neucom.2022.11.092

Abstract

Principal Component Analysis (PCA) is a powerful unsupervised dimensionality reduction algorithm whose squared $\ell_2$-norm formulation cleverly connects the reconstruction error and the projection variance; improved PCA methods typically consider only one of the two, which limits their performance. To alleviate this problem, we propose a novel Max–Min Robust Principal Component Analysis via binary weight, which ingeniously combines the reconstruction error and the projection variance to learn the projection matrix more accurately, and uses the $\ell_2$-norm as the evaluation criterion to make the model rotation invariant. In addition, we design a binary weight that removes outliers, which improves the robustness of the model and endows it with anomaly detection ability. We then develop an efficient iterative optimization algorithm to solve the resulting problem. Extensive experimental results show that our model outperforms related state-of-the-art PCA methods.

Introduction

Principal Component Analysis (PCA) [1], [2] is a very popular unsupervised dimensionality reduction method [3], [4] that is often used in image denoising [5], [6], [7], [8], image compression [9], [10], [11], subspace learning [12], [13], etc. In recent years, it has also been widely applied in biology and chemistry, for example in cancelable biometrics [14], biometric cryptosystems [15] and chemometrics [16]. It therefore occupies an important place in many fields.

PCA learns the projection matrix by taking the maximum projection variance or the minimum reconstruction error as the cost function [17], [18]. To be specific, suppose the data matrix is $X=[x_1,x_2,\ldots,x_n]\in\mathbb{R}^{d\times n}$, where $d$ and $n$ denote the dimensionality and the number of samples, respectively. Without loss of generality, the data have been centered, i.e., $\sum_{i=1}^{n}x_i=0$. Let $W=[w_1,w_2,\ldots,w_k]\in\mathbb{R}^{d\times k}$ be a projection matrix; traditional PCA then solves the following optimization problem:
$$\max_{W}\sum_{i=1}^{n}\|W^\top x_i\|_2^2 \;\Longleftrightarrow\; \max_{W}\operatorname{Tr}\!\left(W^\top XX^\top W\right) \;\Longleftrightarrow\; \min_{W}\sum_{i=1}^{n}\|x_i-WW^\top x_i\|_2^2, \quad \text{s.t.}\ W^\top W=I.$$
Since the above objective functions are based on the squared $\ell_2$-norm, they are equivalent; but for the same reason, PCA is very sensitive to outliers [19]. Many recent research efforts have been devoted to alleviating this drawback.
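
As a concrete reference point, the following minimal NumPy sketch (ours, not the authors' released code) computes the classical PCA projection from the scatter matrix $XX^\top$ and numerically checks that the variance and reconstruction views coincide:

    import numpy as np

    def pca(X, k):
        # X is d x n and assumed centered; returns the d x k projection matrix W.
        # The k leading eigenvectors of X X^T maximize Tr(W^T X X^T W) s.t. W^T W = I.
        eigvals, eigvecs = np.linalg.eigh(X @ X.T)  # eigenvalues in ascending order
        return eigvecs[:, -k:]

    rng = np.random.default_rng(0)
    X = rng.standard_normal((5, 100))
    X -= X.mean(axis=1, keepdims=True)          # center: sum_i x_i = 0
    W = pca(X, 2)
    proj_var = np.sum((W.T @ X) ** 2)           # sum_i ||W^T x_i||_2^2
    rec_err = np.sum((X - W @ (W.T @ X)) ** 2)  # sum_i ||x_i - W W^T x_i||_2^2
    assert np.isclose(proj_var + rec_err, np.sum(X ** 2))  # variance + error = total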

In [20], Wright et al. proposed L1PCA, which uses the $\ell_1$-norm to minimize the reconstruction error and tries to solve the following problem:
$$\min_{W}\sum_{i=1}^{n}\|x_i-WW^\top x_i\|_1,\quad \text{s.t.}\ W^\top W=I.$$
In [21], Kwak et al. proposed PCAL1 based on the maximum projected variance, which focuses on solving the following problem:
$$\max_{W}\sum_{i=1}^{n}\|W^\top x_i\|_1,\quad \text{s.t.}\ W^\top W=I.$$
In [22], Ding et al. proposed R1PCA, which applies the $\ell_2$-norm within each sample and the $\ell_1$-norm across samples, i.e., an $\ell_{2,1}$-norm criterion on the residual:
$$\min_{W}\|X-WW^\top X\|_{2,1}=\min_{W}\sum_{i=1}^{n}\|x_i-WW^\top x_i\|_2,\quad \text{s.t.}\ W^\top W=I.$$
In [23], Wang et al. proposed $\ell_{2,p}$-PCA, which interpolates between PCA and R1PCA by setting different values of $p$:
$$\min_{W}\sum_{i=1}^{n}\|x_i-WW^\top x_i\|_2^p,\quad \text{s.t.}\ W^\top W=I.$$
Similar studies include RPCA-OM [24], TRPCA [25], KPCA [26] and more [27], [28], [29], [30], [31]. Although these existing PCA variants mitigate the effect of outliers to some extent, they usually focus on either the minimum reconstruction error or the maximum projection variance, without considering the connection between the two. In fact, traditional PCA has the property
$$\sum_{i=1}^{n}\|x_i-WW^\top x_i\|_2^2+\sum_{i=1}^{n}\|W^\top x_i\|_2^2=\sum_{i=1}^{n}\|x_i\|_2^2,$$
which means that the projection variance and the reconstruction error are taken into account simultaneously, whereas the improved methods above lose this equivalence, which limits their performance.
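
This identity is simply the Pythagorean theorem for the orthogonal projector $WW^\top$; a one-line derivation (standard linear algebra, not specific to this paper):
$$\|x_i\|_2^2=\|WW^\top x_i\|_2^2+\|x_i-WW^\top x_i\|_2^2=\|W^\top x_i\|_2^2+\|x_i-WW^\top x_i\|_2^2,$$
which holds because $W^\top W=I$ makes the two components orthogonal, $(WW^\top x_i)^\top(x_i-WW^\top x_i)=0$, and $\|WW^\top x_i\|_2=\|W^\top x_i\|_2$.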

In this paper, we propose a novel Max–Min Robust Principal Component Analysis via binary weight (MMRPCA), which considers the reconstruction error and the projection variance simultaneously, so that the model not only obtains strong reconstruction ability but also makes the data more separable after dimensionality reduction. Furthermore, to improve robustness, we carefully design a binary weight that lets the model treat normal samples and outliers differently and eliminates the negative effects of outliers, yielding a more accurate projection matrix. Interestingly, we also find that the binary weight provides an anomaly detection ability that existing PCA methods do not consider. To solve the model, we develop an efficient iterative optimization algorithm and rigorously guarantee its convergence. Extensive experimental results show that the proposed method outperforms competing methods.

Notations: In this paper, we use uppercase letters, bold lowercase letters and lowercase letters to represent matrices, vectors and scalars, respectively. For a matrix $Q$, $q_i$ denotes its $i$-th column, $Q^\top$ its transpose, $\operatorname{Tr}(Q)$ its trace, and $I$ denotes the identity matrix. For a vector $r$, $\|r\|_1$ and $\|r\|_2$ are its $\ell_1$-norm and $\ell_2$-norm, respectively.

Section snippets

Methodology

Let $p_i=\|W^\top x_i\|_2$ and $r_i=\|x_i-WW^\top x_i\|_2$ denote the projection variance and the reconstruction error of a sample $x_i$, respectively. As shown in Fig. 1, taking two-dimensional space as an example, for a data point $x_i$ the projection $p_i$ and the reconstruction error $r_i$ form the two legs of a right triangle whose hypotenuse is $\|x_i\|_2$. The essence of PCA is to make $p_i$ as large as possible and $r_i$ as small as possible, so that the important …

Optimization

To solve objective function (7), we first introduce a related theorem [32], [33]:

Theorem 1

For any two functions $F(W)$ and $G(W)$ of $W$, consider the problem
$$\arg\max_{W^\top W=I}\frac{F(W)}{G(W)}.\tag{8}$$
The solution of problem (8) can be obtained by optimizing the following problem:
$$\arg\max_{W^\top W=I}\ F(W)-\lambda\,G(W),$$
where $\lambda$ is also related to $W$.
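
Theorem 1 is a Dinkelbach-style reduction for ratio maximization: one alternately fixes $\lambda=F(W)/G(W)$ at the current iterate and re-maximizes $F(W)-\lambda G(W)$. A minimal generic sketch of this scheme (our illustration; solve_subproblem is a hypothetical inner solver, not the paper's Algorithm 1):

    def ratio_maximize(F, G, solve_subproblem, W0, tol=1e-8, max_iter=100):
        # Generic Dinkelbach-style iteration for max F(W) / G(W).
        # solve_subproblem(lam) must return argmax_W F(W) - lam * G(W)
        # over the feasible set (here, matrices W with orthonormal columns).
        W = W0
        for _ in range(max_iter):
            lam = F(W) / G(W)                    # current ratio value
            W_new = solve_subproblem(lam)        # maximize F - lam * G
            if F(W_new) - lam * G(W_new) < tol:  # no improvement: lam is optimal
                break
            W = W_new
        return W

At convergence the subproblem value reaches zero, which is exactly the stationarity condition for the ratio.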

Now let us solve objective function (7). Obviously, there are two variables in problem (7) that need to be optimized, $g$ and $W$. By Theorem 1 above, let $\xi_i=\frac{g_i\,\|W^\top x_i\|_2}{\|x_i-WW^\top x_i\|_2}$; then solving problem (7) is …
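
The snippet truncates here, but the Lagrangian in the convergence analysis below (with constraints $W^\top W=I$ and $\sum_i g_i=\eta$, $g$ binary) suggests an alternating scheme: with $W$ fixed, set $g_i=1$ for the $\eta$ samples with the largest per-sample ratios and $g_i=0$ for the rest (flagging them as outliers); with $g$ and $\xi$ fixed, update $W$ from the leading eigenvectors of a weighted scatter matrix $P$. The NumPy sketch below is our reconstruction under these assumptions; in particular, the form of $P$ is our guess from differentiating $\sum_i\big(g_i\|W^\top x_i\|_2-\xi_i\|x_i-WW^\top x_i\|_2\big)$, not a formula given in the snippet:

    import numpy as np

    def mmrpca_step(X, W, eta):
        # One hypothetical alternating step (our reading of the section snippets).
        p = np.linalg.norm(W.T @ X, axis=0)            # p_i = ||W^T x_i||_2
        r = np.linalg.norm(X - W @ (W.T @ X), axis=0)  # r_i = ||x_i - W W^T x_i||_2
        ratio = p / (r + 1e-12)                        # guard against zero residual
        g = np.zeros(X.shape[1])
        g[np.argsort(ratio)[-eta:]] = 1.0              # keep eta most "normal" samples
        xi = g * ratio                                 # xi_i = g_i * p_i / r_i
        # Assumed weighted scatter matrix P = sum_i (g_i/p_i + xi_i/r_i) x_i x_i^T,
        # obtained by setting the subproblem gradient to zero (P W = W Lambda).
        c = g / (p + 1e-12) + xi / (r + 1e-12)
        P = (X * c) @ X.T
        _, eigvecs = np.linalg.eigh(P)
        return eigvecs[:, -W.shape[1]:], g             # updated W, binary weights g

Samples with $g_i=0$ drop out of $P$ entirely, which is how the binary weight removes outliers and doubles as an anomaly detector.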

Convergence analysis

Theorem 2. Algorithm 1 will monotonically increase the value of objective function (7) until convergence.

Proof. The Lagrangian function of problem (10) is
$$L_0=\sum_{i=1}^{n}\Big(g_i\,\|W^\top x_i\|_2-\xi_i\,\|x_i-WW^\top x_i\|_2\Big)-\operatorname{Tr}\!\big((W^\top W-I)\Lambda\big)-\zeta\Big(\sum_{i=1}^{n}g_i-\eta\Big),$$
where $\Lambda$ and $\zeta$ are Lagrange multipliers. Taking the derivative with respect to $W$ and setting it to zero, the KKT condition [35] of problem (10) is
$$PW=W\Lambda.$$
According to the fourth step of Algorithm 1, the Lagrangian function of problem (13) is
$$L_1=\operatorname{Tr}(W^\top PW)-\operatorname{Tr}\big((W^\top W-I)\Lambda\big).$$
Taking the derivative with respect to $W$ and setting it …
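
Both Lagrangians share the stationarity condition $PW=W\Lambda$, which eigenvectors of $P$ satisfy; for the trace maximization in problem (13) the maximizer is the $k$ leading eigenvectors of $P$. A short NumPy check of this standard fact (generic linear algebra, not paper-specific code):

    import numpy as np

    A = np.random.default_rng(1).standard_normal((6, 6))
    P = A @ A.T                                   # symmetric PSD stand-in for the paper's P
    k = 2
    _, V = np.linalg.eigh(P)                      # eigenvalues in ascending order
    W = V[:, -k:]                                 # argmax Tr(W^T P W) s.t. W^T W = I
    assert np.allclose(P @ W, W @ (W.T @ P @ W))  # KKT: P W = W Lambda (Lambda diagonal)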

Experiment

In this section, we verify the superiority of our algorithm and compare it with other algorithms through synthetic experiments, visualization experiments, reconstruction experiments, classification experiments, running-time comparisons, and anomaly detection experiments.

Conclusion

In this paper, we propose a novel Max–Min Robust Principal Component Analysis (MMRPCA), which combines the reconstruction error and the projection variance simultaneously through the $\ell_2$-norm and fully accounts for the reconstructability and separability of the model, which other improved PCA methods do not consider. To improve robustness, we design a binary weight to remove outliers, which also gives the model anomaly detection ability. To solve the resulting problem, we explore an efficient iterative …

CRediT authorship contribution statement

Sisi Wang: Conceptualization, Methodology, Writing - original draft. Feiping Nie: Data curation, Validation, Funding acquisition. Zheng Wang: Writing - review & editing, Visualization, Investigation. Rong Wang: Supervision, Formal analysis. Xuelong Li: Project administration.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This work was supported in part by the National Natural Science Foundation of China under Grant 62236001, in part by the Natural Science Basic Research Program of Shaanxi under Program 2021JM-071, in part by the National Natural Science Foundation of China under Grant 62176212, Grant 61936014 and Grant 61772427, and in part by the Fundamental Research Funds for the Central Universities under Grant G2019KY0501.

References (46)

  • Y. Liu et al., Robust neighborhood embedding for unsupervised feature selection, Knowl.-Based Syst. (2020).
  • Q. Ye et al., Flexible orthogonal semisupervised learning for dimension reduction with image classification, Neurocomputing (2014).
  • A. Parkins et al., Genetic programming techniques for hand written digit recognition, Signal Process. (2004).
  • T. Mandal et al., Curvelet based face recognition via dimension reduction, Signal Process. (2009).
  • H. Abdi et al., Principal component analysis, WIREs Comput. Stat. (2010).
  • T. Tasdizen, Principal components for non-local means image denoising, in: 2008 15th IEEE International Conference on …
  • Y.M.M. Babu et al., PCA based image denoising, Signal Image Process. (2012).
  • K. Dabov et al., BM3D image denoising with shape-adaptive principal component analysis.
  • X. Yang et al., Fuzzy embedded clustering based on bipartite graph for large-scale hyperspectral image, IEEE Geosci. Remote Sens. Lett. (2022).
  • Q. Du et al., Low-complexity principal component analysis for hyperspectral image compression, Int. J. High Perform. Comput. Appl. (2008).
  • N. Vaswani et al., Robust subspace learning: Robust PCA, robust subspace tracking, and robust subspace recovery, IEEE Signal Process. Mag. (2018).
  • J. Zhan et al., Robust PCA with partial subspace knowledge, IEEE Trans. Signal Process. (2015).
  • N. Kumar et al., Random permutation principal component analysis for cancelable biometric recognition, Appl. Intell. (2018).

    Sisi Wang received the M.S. degree from Northwestern Polytechnical University, Xi’an, China, where she is currently pursuing the Ph.D. degree with the School of Computer Science and the School of Artificial Intelligence, OPtics and ElectroNics (iOPEN). Her current research interests are machine learning and its applications, such as dimensionality reduction, feature selection, object detection, and anomaly detection.

    Feiping Nie received the Ph.D. degree in computer science from Tsinghua University, Beijing, China, in 2009. He is currently a Full Professor with Northwestern Polytechnical University, Xi’an, China. His research interests are machine learning and its applications, such as pattern recognition, data mining, computer vision, image processing, and information retrieval. He has published more than 100 papers in the following journals and conferences: TPAMI, TIP, TNNLS, TKDE, ICML, NIPS, KDD, IJCAI, AAAI, ICCV. His papers have been cited more than 20000 times and the H-index is 84. He is now serving as Associate Editor or PC member for several prestigious journals and conferences in the related fields.

    Zheng Wang received the M.S. degree from Anhui University, Hefei, China. He is currently pursuing the Ph.D. degree with the School of Computer Science and the School of Artificial Intelligence, OPtics and ElectroNics (iOPEN), Northwestern Polytechnical University, Xi’an, China. He has published several articles in journals, such as TPAMI, TCyb, TKDE, TKDD, and PR. His research interests mainly focus on representation learning for generic data and its applications.

    Rong Wang received the B.S. degree in information engineering, the M.S. degree in signal and information processing, and the Ph.D. degree in computer science from Xi’an Research Institute of Hi-Tech, Xi’an, China, in 2004, 2007 and 2013, respectively. He also studied at the Department of Automation, Tsinghua University, Beijing, China, in 2007 and 2013, for his Ph.D. degree. He is currently an associate professor with the School of Artificial Intelligence, OPtics and ElectroNics (iOPEN), Northwestern Polytechnical University, Xi’an, China. His research interests focus on machine learning and its applications.

    Xuelong Li (M’02–SM’07–F’12) is a full professor with the School of Artificial Intelligence, OPtics and ElectroNics (iOPEN), Northwestern Polytechnical University, Xi’an, China.
