Integrative sparse principal component analysis

https://doi.org/10.1016/j.jmva.2018.02.002

Abstract

In the analysis of data with high-dimensional covariates and small sample sizes, dimension reduction techniques are extensively employed, and principal component analysis (PCA) is perhaps the most popular among them. To remove noise effectively and generate more interpretable results, the sparse PCA (SPCA) technique has been developed. In high dimensions, the analysis of a single dataset often yields unsatisfactory results. In a series of studies under the “regression analysis + variable selection” setting, it has been shown that integrative analysis provides an effective way of pooling information from multiple independent datasets and outperforms single-dataset analysis and many alternative multi-dataset analyses, including classic meta-analysis. In this study, with multiple independent datasets, we propose conducting dimension reduction with a novel integrative SPCA (iSPCA) approach. Penalization is adopted for regularized estimation and selection of important loadings. Advancing from existing integrative analysis studies, we further impose contrasted penalties, which may yield more accurate estimation and selection. Multiple settings of similarity across datasets are considered. Consistency properties of the proposed approach are established, and effective computational algorithms are developed. A wide spectrum of simulations demonstrates the competitive performance of iSPCA over the alternatives. Two real-data analyses further establish its practical applicability.
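To make the general idea concrete, the sketch below is a minimal illustration (not the authors' iSPCA algorithm) of penalized, integrative estimation of sparse loadings: for each independent dataset, a sparse leading loading vector is obtained by soft-thresholding within a rank-one power iteration, and a simple "contrast"-style shrinkage then pulls the per-dataset loadings toward their cross-dataset average. The tuning parameters `lam` and `gamma` and the helper functions are hypothetical and for illustration only.

```python
# Illustrative sketch of sparse + integrative loading estimation (assumptions:
# l1 soft-thresholding for sparsity, averaging-based shrinkage for similarity).
import numpy as np

def soft_threshold(v, lam):
    """Elementwise soft-thresholding operator."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def sparse_leading_loading(X, lam, n_iter=50):
    """Sparse rank-one loading via alternating power updates with an l1 penalty."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    v = Vt[0]                          # initialize with the ordinary PCA loading
    for _ in range(n_iter):
        u = X @ v
        u /= np.linalg.norm(u) + 1e-12
        v = soft_threshold(X.T @ u, lam)
        nv = np.linalg.norm(v)
        if nv == 0:                    # penalty too large: all loadings zeroed out
            return v
        v /= nv
    return v

def integrative_sparse_loadings(datasets, lam=0.1, gamma=0.5):
    """Per-dataset sparse loadings, sign-aligned, then shrunk toward their average."""
    loadings = [sparse_leading_loading(X, lam) for X in datasets]
    ref = loadings[0]
    loadings = [v if v @ ref >= 0 else -v for v in loadings]   # align signs
    avg = np.mean(loadings, axis=0)
    contrasted = []
    for v in loadings:
        w = (1 - gamma) * v + gamma * avg   # crude similarity ("contrast") pull
        n = np.linalg.norm(w)
        contrasted.append(w / n if n > 0 else w)
    return contrasted

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true = np.zeros(20)
    true[:5] = 1 / np.sqrt(5)                                  # shared sparse direction
    data = [rng.standard_normal((40, 20))
            + 3 * np.outer(rng.standard_normal(40), true) for _ in range(3)]
    for m, v in enumerate(integrative_sparse_loadings(data)):
        print(f"dataset {m}: nonzero loadings at", np.nonzero(np.abs(v) > 0.1)[0])
```

In this toy example the three datasets share one sparse signal direction, and the printed support of each estimated loading should concentrate on the first five variables; the contrasted-penalty formulation in the paper pursues the same goal of borrowing strength across datasets in a principled, data-adaptive way.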

AMS subject classifications

62H25

Keywords

Contrasted penalization
Integrative analysis
Sparse PCA
