Max–Min Robust Principal Component Analysis
Introduction
Principal Component Analysis (PCA) [1], [2] is a very popular unsupervised dimensionality reduction method [3], [4], which is often used in image denoising [5], [6], [7], [8], image compression [9], [10], [11], subspace learning [12], [13], etc. In recent years, with the development of science and technology, it has also been widely used in biology and chemistry, for applications such as cancelable biometrics [14], biometric cryptosystems [15] and chemometrics [16]. Therefore, it occupies an important place in many fields.
PCA learns the projection matrix using the maximum projection variance or the minimum reconstruction error as its cost function [17], [18]. To be specific, suppose the data matrix is $X = [x_1, x_2, \ldots, x_n] \in \mathbb{R}^{d \times n}$, where $d$ and $n$ represent the dimensionality and number of data, respectively. Without loss of generality, the data have been centralized, i.e., $\sum_{i=1}^{n} x_i = 0$. Let $W \in \mathbb{R}^{d \times m}$ ($m < d$) be a projection matrix with $W^\top W = I$; traditional PCA then focuses on solving either of the following optimization problems: $\max_{W^\top W = I} \|W^\top X\|_F^2$ or $\min_{W^\top W = I} \|X - W W^\top X\|_F^2$. Since both objective functions are based on the squared $\ell_2$-norm, they are equivalent, but for the same reason PCA is very sensitive to outliers [19]. Many recent research efforts have been proposed to alleviate this drawback.
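The equivalence of the two classical objectives can be checked numerically. The following sketch (standard PCA via an eigendecomposition of the scatter matrix, illustrative code rather than the paper's) verifies that maximizing the projected variance and minimizing the reconstruction error amount to the same thing, because their sum is the fixed total variance:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 200))          # d = 5 features, n = 200 samples
X = X - X.mean(axis=1, keepdims=True)  # centralize: each feature has zero mean

# Classical PCA: the top-m eigenvectors of the scatter matrix X X^T
# maximize the projected variance tr(W^T X X^T W) subject to W^T W = I.
m = 2
S = X @ X.T
eigvals, eigvecs = np.linalg.eigh(S)   # eigenvalues in ascending order
W = eigvecs[:, -m:]                    # top-m eigenvectors

var_proj = np.trace(W.T @ S @ W)                        # projected variance
recon_err = np.linalg.norm(X - W @ W.T @ X, 'fro')**2   # reconstruction error

# The two criteria sum to the (fixed) total variance tr(X X^T),
# so maximizing one is exactly minimizing the other.
total_var = np.trace(S)
assert np.isclose(var_proj + recon_err, total_var)
```

The identity holds for any orthonormal $W$, not only the eigenvector solution, which is why the two formulations pick out the same optimum.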
In [20], Wright et al. proposed L1PCA, which uses the $\ell_1$-norm to minimize the reconstruction error, i.e., it tries to solve $\min_{W^\top W = I} \|X - W W^\top X\|_1$. In [21], Kwak et al. proposed PCA-L1 based on the maximum projected variance, which focuses on solving $\max_{W^\top W = I} \|W^\top X\|_1$. In [22], Ding et al. proposed R1PCA by applying the $\ell_1$-norm and $\ell_2$-norm to the data and spatial dimensions, respectively, which attempts to optimize $\min_{W^\top W = I} \sum_{i=1}^{n} \|x_i - W W^\top x_i\|_2$. In [23], Wang et al. proposed $\ell_{2,p}$-PCA using the $\ell_{2,p}$-norm, which connects PCA and R1PCA by setting different values of $p$, i.e., it optimizes $\min_{W^\top W = I} \sum_{i=1}^{n} \|x_i - W W^\top x_i\|_2^p$. Similar studies also include RPCA-OM [24], TRPCA [25], KPCA [26] and more [27], [28], [29], [30], [31]. Although these existing PCA works mitigate the effects of outliers to some extent, they usually focus on either the minimum reconstruction error or the maximum projection variance without considering the connection between the two. In fact, traditional PCA has the property $\|X\|_F^2 = \|W^\top X\|_F^2 + \|X - W W^\top X\|_F^2$, which means that the projection variance and the reconstruction error of the data are considered at the same time, while the improved methods above do not enjoy this equivalence relationship, resulting in limited model performance.
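The outlier sensitivity that motivates these variants is easy to illustrate. The sketch below (illustrative only; the subspace and data are arbitrary, not from the paper's experiments) compares the squared Frobenius cost of classical PCA with the $\ell_{2,1}$-style cost of R1PCA on data containing one gross outlier:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(5, 100))
X[:, 0] += 50.0   # plant one gross outlier in the first sample

W = np.linalg.qr(rng.normal(size=(5, 2)))[0]   # some orthonormal 2-D basis
R = X - W @ W.T @ X                             # per-sample residuals

col_norms = np.linalg.norm(R, axis=0)  # l2 residual of each sample
cost_f2  = np.sum(col_norms**2)        # squared Frobenius: outlier counts quadratically
cost_l21 = np.sum(col_norms)           # l2,1-style: outlier counts only linearly

# The outlier dominates the squared cost far more than the l2,1 cost,
# so an l2,1 objective lets normal samples keep their influence.
share_f2  = col_norms[0]**2 / cost_f2
share_l21 = col_norms[0]    / cost_l21
assert share_f2 > share_l21
```

This is precisely why replacing the squared $\ell_2$-norm with unsquared per-sample norms reduces the leverage of outliers on the learned subspace.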
In this paper, we propose a novel Max–Min Robust Principal Component Analysis via binary weight (MMRPCA), which considers reconstruction error and projection variance simultaneously, so that the model not only obtains strong reconstruction ability but also makes the data more separable after dimensionality reduction. Furthermore, to improve the robustness of the model, we carefully design a binary weight so that the model treats normal samples and outliers differently, eliminating the negative effects of outliers to obtain a more accurate projection matrix. Interestingly, we also find that anomaly detection can be achieved through the binary weight, which is not considered by existing PCA methods. To solve this model, we explore an efficient iterative optimization algorithm and strictly guarantee its convergence. Extensive experimental results show that our proposed method outperforms other methods.
Notations: In this paper, we use uppercase letters, bold lowercase letters and lowercase letters to represent matrices, vectors and scalars, respectively. For a matrix $Q$, $q_i$ represents its $i$-th column, $Q^\top$ is the transpose of $Q$, $\mathrm{tr}(Q)$ is the trace of $Q$, and $I$ denotes the identity matrix. For a vector $q$, $\|q\|_1 = \sum_i |q_i|$ and $\|q\|_2 = \sqrt{\sum_i q_i^2}$ are the $\ell_1$-norm and $\ell_2$-norm of $q$, respectively.
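As a small concrete check of these notations (illustrative values, not from the paper), together with the column-wise $\ell_{2,1}$-norm that appears in the robust PCA variants above:

```python
import numpy as np

q = np.array([3.0, -4.0])
assert np.linalg.norm(q, 1) == 7.0   # l1-norm: sum of absolute values
assert np.linalg.norm(q, 2) == 5.0   # l2-norm: Euclidean length

# l2,1-norm of a matrix: l2-norm of each column, then l1 (sum) across columns.
Q = np.array([[ 3.0, 0.0],
              [-4.0, 1.0]])
l21 = np.sum(np.linalg.norm(Q, axis=0))
assert np.isclose(l21, 5.0 + 1.0)
```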
Section snippets
Methodology
Consider the projection variance and the reconstruction error of the data $X$, both functions of the projection matrix $W$. As shown in Fig. 1, take two-dimensional space as an example. For a data point, the projection variance can be represented by the short side of a right triangle and the reconstruction error by the long side. The essence of PCA is to make the projection variance as large as possible and the reconstruction error as small as possible, so that the important
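The right-triangle picture in Fig. 1 is the Pythagorean decomposition of each data point with respect to an orthonormal basis; a minimal numerical check (with an arbitrary point and subspace, for illustration) is:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=3)                          # one data point
W = np.linalg.qr(rng.normal(size=(3, 2)))[0]    # orthonormal projection basis

proj  = np.linalg.norm(W.T @ x)          # length of the projected component
resid = np.linalg.norm(x - W @ W.T @ x)  # reconstruction error (residual leg)

# The projected component and the residual are orthogonal, so the two legs
# and the point's length form a right triangle: enlarging one leg
# necessarily shrinks the other.
assert np.isclose(proj**2 + resid**2, np.linalg.norm(x)**2)
```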
Optimization
To solve objective function (7), we first introduce a related theorem [32], [33]: Theorem 1. For any two functions $f(W)$ and $g(W)$, both related to $W$, the solution of problem (8) can be obtained by optimizing an equivalent weighted problem whose weight is also related to $W$.
Now, let us solve objective function (7). Obviously, there are two variables, the binary weight and $W$, in problem (7) that need to be optimized. By Theorem 1 above, solving problem (7) is
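The snippet omits the paper's Algorithm 1. Objectives of this robust, $\ell_{2,1}$-flavored kind are, however, commonly handled by iteratively reweighted eigendecompositions; the following is a generic sketch of that standard scheme (function name and details are hypothetical, not the paper's algorithm):

```python
import numpy as np

def robust_pca_irls(X, m, n_iter=30, eps=1e-8):
    """Iteratively reweighted scheme for min_W sum_i ||x_i - W W^T x_i||_2
    s.t. W^T W = I (generic sketch; hypothetical helper, not Algorithm 1)."""
    d, n = X.shape
    w = np.ones(n)
    for _ in range(n_iter):
        # Weighted scatter matrix; its top-m eigenvectors solve the
        # weighted (quadratic) subproblem exactly.
        S = (X * w) @ X.T
        W = np.linalg.eigh(S)[1][:, -m:]
        # Reweight: samples with large residuals (outliers) get small weights,
        # mimicking the unsquared l2 cost.
        resid = np.linalg.norm(X - W @ W.T @ X, axis=0)
        w = 1.0 / (2.0 * np.maximum(resid, eps))
    return W

rng = np.random.default_rng(3)
X = rng.normal(size=(5, 100))
X[:, :5] += 30.0                    # a few planted outliers
W = robust_pca_irls(X, m=2)
assert np.allclose(W.T @ W, np.eye(2), atol=1e-8)   # orthonormal output
```

Each iteration solves a weighted eigenproblem in closed form, which is why such schemes admit the kind of monotone-convergence argument given in the next section.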
Convergence analysis
Theorem 2. Algorithm 1 monotonically decreases the value of objective function (7) until convergence.
Proof. Form the Lagrangian function of problem (10) with Lagrange multipliers for its constraints. Taking the derivative with respect to $W$ and setting it to zero yields the KKT condition [35] of problem (10). According to the fourth step of Algorithm 1, the Lagrangian function of problem (13) is formed in the same way; taking the derivative with respect to $W$ and setting it
Experiment
In this section, we verify the superiority of our algorithm by comparing it with other algorithms through a synthetic experiment, a visualization experiment, a reconstruction experiment, a classification experiment, a running-time comparison, and an anomaly detection experiment.
Conclusion
In this paper, we propose a novel Max–Min Robust Principal Component Analysis (MMRPCA), which combines reconstruction error and projection variance simultaneously in a single norm-based objective and fully considers both the reconstructability and the separability of the model, which is not considered by other improved PCA methods. To improve the robustness of the model, we design a binary weight to remove outliers, so that the model also gains anomaly detection ability. To solve this problem, we explore an efficient iterative optimization algorithm and strictly guarantee its convergence.
CRediT authorship contribution statement
Sisi Wang: Conceptualization, Methodology, Writing - original draft. Feiping Nie: Data curation, Validation, Funding acquisition. Zheng Wang: Writing - review & editing, Visualization, Investigation. Rong Wang: Supervision, Formal analysis. Xuelong Li: Project administration.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgment
This work was supported in part by the National Natural Science Foundation of China under Grant 62236001, in part by the Natural Science Basic Research Program of Shaanxi under Program 2021JM-071, in part by the National Natural Science Foundation of China under Grant 62176212, Grant 61936014 and Grant 61772427, and in part by the Fundamental Research Funds for the Central Universities under Grant G2019KY0501.
Sisi Wang received the M.S. degree from Northwestern Polytechnical University, Xi’an, China, where she is currently pursuing the Ph.D. degree with the School of Computer Science and the School of Artificial Intelligence, OPtics and ElectroNics (iOPEN). Her current research interests are machine learning and its applications, such as dimensionality reduction, feature selection, object detection, and anomaly detection.
References (46)
- et al., Principal component analysis, Chemometr. Intell. Labor. Syst. (1987).
- et al., A novel dimensionality reduction method: Similarity order preserving discriminant analysis, Signal Process. (2021).
- et al., Adaptive graph weighting for multi-view dimensionality reduction, Signal Process. (2019).
- et al., Limited-energy output formation for multiagent systems with intermittent interactions, J. Franklin Inst. (2021).
- et al., Color image compression using PCA and backpropagation learning, Pattern Recogn. (2000).
- et al., A fast method for robust principal components with applications to chemometrics, Chemometr. Intell. Labor. Syst. (2002).
- et al., PCA document reconstruction for email classification, Comput. Stat. Data Anal. (2012).
- et al., Algorithms for projection pursuit robust principal component analysis, Chemometr. Intell. Labor. Syst. (2007).
- et al., L1-norm-based principal component analysis with adaptive regularization, Pattern Recogn. (2016).
- et al., Robust principal component analysis via optimal mean by joint ℓ2,1 and Schatten p-norms minimization, Neurocomputing (2018).
- Robust neighborhood embedding for unsupervised feature selection, Knowl.-Based Syst.
- Flexible orthogonal semisupervised learning for dimension reduction with image classification, Neurocomputing.
- Genetic programming techniques for handwritten digit recognition, Signal Process.
- Curvelet based face recognition via dimension reduction, Signal Process.
- Principal component analysis, WIREs Comput. Stat.
- PCA based image denoising, Signal Image Process.
- BM3D image denoising with shape-adaptive principal component analysis.
- Fuzzy embedded clustering based on bipartite graph for large-scale hyperspectral image, IEEE Geosci. Remote Sens. Lett.
- Low-complexity principal component analysis for hyperspectral image compression, Int. J. High Perform. Comput. Appl.
- Robust subspace learning: Robust PCA, robust subspace tracking, and robust subspace recovery, IEEE Signal Process. Mag.
- Robust PCA with partial subspace knowledge, IEEE Trans. Signal Process.
- Random permutation principal component analysis for cancelable biometric recognition, Appl. Intell.
Feiping Nie received the Ph.D. degree in computer science from Tsinghua University, Beijing, China, in 2009. He is currently a Full Professor with Northwestern Polytechnical University, Xi’an, China. His research interests are machine learning and its applications, such as pattern recognition, data mining, computer vision, image processing, and information retrieval. He has published more than 100 papers in the following journals and conferences: TPAMI, TIP, TNNLS, TKDE, ICML, NIPS, KDD, IJCAI, AAAI, ICCV. His papers have been cited more than 20000 times and the H-index is 84. He is now serving as Associate Editor or PC member for several prestigious journals and conferences in the related fields.
Zheng Wang received the M.S. degree from Anhui University, Hefei, China. He is currently pursuing the Ph.D. degree with the School of Computer Science and the School of Artificial Intelligence, OPtics and ElectroNics (iOPEN), Northwestern Polytechnical University, Xi’an, China. He has published several articles in journals, such as TPAMI, TCyb, TKDE, TKDD, and PR. His research interests mainly focus on representation learning for generic data and its applications.
Rong Wang received the B.S. degree in information engineering, the M.S. degree in signal and information processing, and the Ph.D. degree in computer science from Xi’an Research Institute of Hi-Tech, Xi’an, China, in 2004, 2007 and 2013, respectively. He also studied at the Department of Automation, Tsinghua University, Beijing, China, in 2007 and 2013, for his Ph.D. degree. He is currently an associate professor with the School of Artificial Intelligence, OPtics and ElectroNics (iOPEN), Northwestern Polytechnical University, Xi’an, China. His research interests focus on machine learning and its applications.
Xuelong Li (M’02–SM’07–F’12) is a full professor with the School of Artificial Intelligence, OPtics and ElectroNics (iOPEN), Northwestern Polytechnical University, Xi’an, China.