scGASI: A Graph Autoencoder-Based Single-Cell Integration Clustering Method

  • Conference paper
Bioinformatics Research and Applications (ISBRA 2023)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 14248)

Abstract

Single-cell RNA sequencing (scRNA-seq) makes it possible to study biological questions at the level of individual cells. Identifying cell types by unsupervised clustering is a basic goal of scRNA-seq data analysis. Although a number of single-cell clustering methods have been proposed recently, only a few consider both shallow and deep latent information. We therefore propose scGASI, a graph autoencoder-based single-cell integration clustering method. Working from multiple feature sets, scGASI combines deep feature embedding and data affinity recovery in a single framework to learn a consensus affinity matrix between cells. scGASI first constructs multiple feature sets. Then, to extract the deep latent information embedded in the data, it uses graph autoencoders (GAEs) to learn a low-dimensional latent representation of the data. Next, to fuse the deep information in the embedding space with the shallow information in the raw space, we design a multi-layer kernel self-expression integration strategy. This strategy uses a kernel self-expression model with multi-layer similarity fusion to learn a similarity matrix shared by the raw and embedding spaces of each feature set, and a consensus learning mechanism to learn a consensus affinity matrix across all feature sets. Finally, the consensus affinity matrix is used for spectral clustering, visualization, and identification of gene markers. Large-scale validation on real datasets shows that scGASI achieves higher clustering accuracy than many popular clustering methods.
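The last stages of the pipeline described above (a kernel self-expression affinity per feature set, a consensus affinity across feature sets, then spectral clustering) can be illustrated with a minimal NumPy sketch. It is not the scGASI implementation: the graph autoencoder and the multi-layer fusion are omitted, the ridge-regularised closed form stands in for the paper's self-expression model, a plain average stands in for its consensus learning mechanism, and all function names, the bandwidth heuristic, and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(X, gamma=None):
    """Gaussian kernel between the rows (cells) of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    if gamma is None:
        # Heuristic bandwidth (an assumption): inverse median squared distance.
        gamma = 1.0 / np.median(d2[d2 > 0])
    return np.exp(-gamma * d2)


def self_expression_affinity(K, lam=1.0):
    """Ridge-regularised kernel self-expression (simplified stand-in).

    Minimises tr(K - 2KZ + Z^T K Z) + lam * ||Z||_F^2, whose stationarity
    condition is (K + lam * I) Z = K, then symmetrises |Z| into an affinity.
    """
    n = K.shape[0]
    Z = np.linalg.solve(K + lam * np.eye(n), K)
    return 0.5 * (np.abs(Z) + np.abs(Z.T))


def spectral_labels(A, k, n_iter=50):
    """Normalised spectral clustering on a precomputed affinity matrix A."""
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    L = np.eye(A.shape[0]) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues come back in ascending order
    U = vecs[:, :k]
    U = U / np.linalg.norm(U, axis=1, keepdims=True)

    # Deterministic farthest-point initialisation, then Lloyd iterations.
    centers = [U[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = U[labels == j].mean(axis=0)
    return labels


def cluster_cells(feature_sets, k, lam=1.0):
    """Average per-feature-set affinities (a stand-in for consensus learning)."""
    affinities = [self_expression_affinity(rbf_kernel(X), lam) for X in feature_sets]
    consensus = np.mean(affinities, axis=0)
    return spectral_labels(consensus, k), consensus


# Synthetic stand-in for expression data: 40 cells, two well-separated groups.
rng = np.random.default_rng(0)
cells = np.vstack([rng.normal(0.0, 0.5, (20, 10)), rng.normal(5.0, 0.5, (20, 10))])
feature_sets = [cells[:, :6], cells[:, 4:]]  # two overlapping gene subsets
labels, consensus = cluster_cells(feature_sets, k=2)
```

In this sketch each feature set contributes one affinity matrix, so adding an embedding-space kernel per feature set (for example, one computed from autoencoder outputs) would only require appending more matrices before averaging. The learned consensus in the paper is presumably more selective than a uniform mean.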

Acknowledgment

This work is supported by the National Natural Science Foundation of China (No. 62172253) and jointly supported by the Program for Youth Innovative Research Team in the University of Shandong Province in China (No. 2022KJ179).

Author information

Corresponding author

Correspondence to Juan Wang.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Qiao, TJ., Li, F., Yuan, S., Dai, LY., Wang, J. (2023). scGASI: A Graph Autoencoder-Based Single-Cell Integration Clustering Method. In: Guo, X., Mangul, S., Patterson, M., Zelikovsky, A. (eds) Bioinformatics Research and Applications. ISBRA 2023. Lecture Notes in Computer Science, vol 14248. Springer, Singapore. https://doi.org/10.1007/978-981-99-7074-2_14

  • DOI: https://doi.org/10.1007/978-981-99-7074-2_14

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-7073-5

  • Online ISBN: 978-981-99-7074-2

  • eBook Packages: Computer Science, Computer Science (R0)
