Abstract
Single-cell RNA sequencing (scRNA-seq) technology makes it possible to study biological questions at the level of individual cells. Identifying cell types by unsupervised clustering is a basic goal of scRNA-seq data analysis. Although a number of single-cell clustering methods have been proposed recently, only a few of them consider both shallow and deep latent information. We therefore propose scGASI, a graph autoencoder-based single-cell integration clustering method. Working from multiple feature sets, scGASI unifies deep feature embedding and data affinity recovery in a single framework to learn a consensus affinity matrix between cells. scGASI first constructs multiple feature sets. Then, to extract the deep latent information embedded in the data, it uses a graph autoencoder (GAE) to learn a low-dimensional latent representation of the data. Next, to fuse the deep latent information in the embedding space with the shallow information in the raw space, we design a multi-layer kernel self-expression integration strategy. This strategy uses a kernel self-expression model with multi-layer similarity fusion to learn a similarity matrix shared by the raw and embedding spaces of a given feature set, and a consensus learning mechanism to learn a consensus affinity matrix across all feature sets. Finally, the consensus affinity matrix is used for spectral clustering, visualization, and identification of gene markers. Large-scale validation on real datasets shows that scGASI achieves higher clustering accuracy than many popular clustering methods.
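The abstract gives no formulas, so the following is only a minimal sketch of the affinity-learning and clustering stages under stated assumptions, not the scGASI implementation. It assumes a Gaussian kernel, a ridge-regularized kernel self-expression model (which has a closed-form solution), plain averaging as a stand-in for the consensus learning mechanism, and a two-way spectral cut in place of full spectral clustering. The graph autoencoder and the multi-layer similarity fusion are not reproduced. All function names and parameters (`sigma`, `lam`) are hypothetical.

```python
import numpy as np


def gaussian_kernel(X, sigma=1.0):
    """Gaussian kernel between the rows (cells) of X. The kernel choice is an assumption."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def kernel_self_expression(K, lam=0.1):
    """Closed-form minimizer of ||phi(X) - phi(X) Z||_F^2 + lam * ||Z||_F^2.

    Setting the gradient to zero gives (K + lam * I) Z = K.
    The ridge penalty is an assumption; the paper may use another regularizer.
    """
    n = K.shape[0]
    return np.linalg.solve(K + lam * np.eye(n), K)


def affinity_from_coefficients(Z):
    """Symmetric, non-negative affinity built from self-expression coefficients."""
    A = 0.5 * (np.abs(Z) + np.abs(Z.T))
    np.fill_diagonal(A, 0.0)  # drop self-similarity
    return A


def consensus_affinity(feature_sets, sigma=1.0, lam=0.1):
    """Average the per-feature-set affinities (a simple stand-in for consensus learning)."""
    affinities = [
        affinity_from_coefficients(
            kernel_self_expression(gaussian_kernel(X, sigma), lam)
        )
        for X in feature_sets
    ]
    return np.mean(affinities, axis=0)


def spectral_bipartition(A):
    """Two-way spectral cut: sign of the second eigenvector of the normalized Laplacian."""
    d = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L = np.eye(len(d)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues are returned in ascending order
    return (vecs[:, 1] > 0).astype(int)
```

Calling `consensus_affinity([X1, X2])` with two cell-by-feature matrices over the same cells returns one symmetric cell-by-cell matrix that `spectral_bipartition` can split into two groups. In this sketch a learned embedding of the cells could be passed as a further matrix in the list, which is a simplification of the raw-space and embedding-space fusion described in the abstract.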
Acknowledgment
This work is supported by the National Natural Science Foundation of China (No. 62172253) and jointly by the Program for Youth Innovative Research Team in the University of Shandong Province in China (No. 2022KJ179).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Qiao, TJ., Li, F., Yuan, S., Dai, LY., Wang, J. (2023). scGASI: A Graph Autoencoder-Based Single-Cell Integration Clustering Method. In: Guo, X., Mangul, S., Patterson, M., Zelikovsky, A. (eds) Bioinformatics Research and Applications. ISBRA 2023. Lecture Notes in Computer Science, vol 14248. Springer, Singapore. https://doi.org/10.1007/978-981-99-7074-2_14
DOI: https://doi.org/10.1007/978-981-99-7074-2_14
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7073-5
Online ISBN: 978-981-99-7074-2
eBook Packages: Computer Science, Computer Science (R0)