CHLPCA: Correntropy-Based Hypergraph Regularized Sparse PCA for Single-Cell Type Identification

  • Conference paper
Bioinformatics Research and Applications (ISBRA 2023)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 14248)

Abstract

Over the past decade, high-throughput sequencing technologies have driven a dramatic increase in single-cell RNA sequencing (scRNA-seq) data. The study of scRNA-seq data has broadened and deepened researchers’ understanding of cellular heterogeneity. Accurate cell type identification is a prerequisite for studying heterogeneous cell populations. However, scRNA-seq data are highly noisy and high-dimensional, which makes it difficult for existing methods to further improve the success rate of cell type identification. Principal component analysis (PCA) is an important data analysis technique that is widely used to identify cell subpopulations. Building on PCA, we propose correntropy-based hypergraph regularized sparse PCA (CHLPCA) for accurate cell type identification. CHLPCA uses correntropy to reduce the effect of noise and constructs a hypergraph to capture higher-order relationships between samples, which compensates for PCA's limited ability to capture local structure. Furthermore, we introduce the L2,1/5-norm into the model to enhance the interpretability of the principal components (PCs), which further improves model performance. CHLPCA achieves superior clustering accuracy, outperforming the best comparative method by 5.13% and 8.00% on the accuracy (ACC) and normalized mutual information (NMI) metrics, respectively. Clustering visualization experiments also confirm that CHLPCA performs the cell type identification task better.
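
The two ingredients named in the abstract, correntropy and a row-wise L2,p norm, can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes the usual Gaussian-kernel sample estimate of correntropy (normalisation constant omitted) and the common definition of the L2,p quasi-norm as (sum_i ||a_i||_2^p)^(1/p). The function names, the kernel width `sigma`, and the toy inputs are all hypothetical choices made for illustration, and `p` is left as a free parameter.

```python
import numpy as np

def correntropy(x, y, sigma=1.0):
    """Sample correntropy with a Gaussian kernel (constant factor omitted).

    Assumed definition: mean over entries of exp(-(x_i - y_i)^2 / (2 sigma^2)).
    """
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.mean(np.exp(-d**2 / (2.0 * sigma**2))))

def l2p_norm(A, p):
    """Row-wise L2,p quasi-norm: (sum_i ||a_i||_2^p)^(1/p), for p > 0."""
    row_norms = np.linalg.norm(np.asarray(A, dtype=float), axis=1)
    return float(np.sum(row_norms**p) ** (1.0 / p))

# Toy data: one gross outlier among five entries.
clean = np.zeros(5)
noisy = clean.copy()
noisy[0] = 100.0

mse = float(np.mean((clean - noisy) ** 2))  # dominated by the outlier
sim = correntropy(clean, noisy)             # each entry contributes at most 1

# Two identical rows versus one non-zero row, with p = 1 and p = 0.5.
dense = [[3.0, 4.0], [3.0, 4.0]]
sparse = [[3.0, 4.0], [0.0, 0.0]]
print(mse, sim)
print(l2p_norm(dense, 1.0), l2p_norm(dense, 0.5))
print(l2p_norm(sparse, 1.0), l2p_norm(sparse, 0.5))
```

In this toy case the squared error is driven almost entirely by the single outlier, while the correntropy value stays bounded because every entry contributes at most 1. Likewise, lowering `p` raises the penalty on the matrix with two non-zero rows but leaves the single-row matrix unchanged, which is the usual intuition for why small `p` favours row sparsity.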

Acknowledgment

This work is supported by the National Natural Science Foundation of China (Grant No. 62172253).

Corresponding author

Correspondence to Juan Wang.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Wang, TG., Kong, XZ., Li, SJ., Wang, J. (2023). CHLPCA: Correntropy-Based Hypergraph Regularized Sparse PCA for Single-Cell Type Identification. In: Guo, X., Mangul, S., Patterson, M., Zelikovsky, A. (eds) Bioinformatics Research and Applications. ISBRA 2023. Lecture Notes in Computer Science, vol 14248. Springer, Singapore. https://doi.org/10.1007/978-981-99-7074-2_44

  • DOI: https://doi.org/10.1007/978-981-99-7074-2_44

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-7073-5

  • Online ISBN: 978-981-99-7074-2

  • eBook Packages: Computer Science, Computer Science (R0)
