
Graph attention autoencoder model with dual decoder for clustering single-cell RNA sequencing data

Applied Intelligence

Abstract

Single-cell ribonucleic acid sequencing (scRNA-seq) allows researchers to study cell heterogeneity and diversity at the level of individual cells. Cell clustering is an essential step in scRNA-seq data processing. However, the high dimensionality and high noise of scRNA-seq data can make this processing difficult. Although many methods are available for scRNA-seq clustering analysis, most of them ignore the topological relationships in scRNA-seq data and do not fully utilize the potential associations between cells. In this study, we present scGAD, a graph attention autoencoder model with a dual decoder structure for clustering scRNA-seq data. We utilize a graph attention autoencoder with two decoders to learn feature representations of cells in latent space. To ensure that the learned latent representation preserves both node properties and graph structure, we use an inner product decoder to reconstruct the graph structure and a learnable graph attention decoder to reconstruct the node properties. On 12 real scRNA-seq datasets, the average NMI and ARI scores of scGAD are 0.762 and 0.695, respectively, outperforming state-of-the-art single-cell clustering approaches. Biological analysis shows that the cell labels predicted by scGAD can assist in the downstream analysis of scRNA-seq data.
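The dual decoder idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the scGAD implementation: the toy ring graph, the random untrained weights, the layer sizes, the loss choices (binary cross-entropy for structure, mean squared error for node properties), and all function and variable names are assumptions made for illustration only. It shows a single-head graph attention encoder producing latent embeddings, an inner product decoder reconstructing the graph, and a second attention layer reconstructing the node features.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention_layer(x, adj, w, a_src, a_dst):
    """Single-head, dense graph attention layer (GAT-style, illustrative)."""
    h = x @ w                                              # project node features
    scores = (h @ a_src)[:, None] + (h @ a_dst)[None, :]   # e_ij = a_src.h_i + a_dst.h_j
    scores = np.where(scores > 0, scores, 0.2 * scores)    # LeakyReLU, slope 0.2
    mask = (adj + np.eye(adj.shape[0])) > 0                # neighbours plus self-loops
    scores = np.where(mask, scores, -np.inf)               # ignore non-neighbours
    scores = scores - scores.max(axis=1, keepdims=True)    # numerically stable softmax
    alpha = np.exp(scores)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)       # attention weights per row
    return alpha @ h                                       # weighted neighbour aggregation

def inner_product_decoder(z):
    """Reconstruct edge probabilities from embeddings: A_hat = sigmoid(Z Z^T)."""
    return sigmoid(z @ z.T)

rng = np.random.default_rng(0)
n_cells, n_genes, n_latent = 6, 5, 2

# Toy inputs (assumed): random expression matrix and a ring-shaped cell graph.
x = rng.normal(size=(n_cells, n_genes))
adj = np.zeros((n_cells, n_cells))
for i in range(n_cells):
    j = (i + 1) % n_cells
    adj[i, j] = adj[j, i] = 1.0

# Encoder: graph attention layer mapping genes -> latent space (untrained weights).
z = attention_layer(x, adj,
                    rng.normal(size=(n_genes, n_latent)),
                    rng.normal(size=n_latent), rng.normal(size=n_latent))

# Decoder 1: inner product decoder reconstructs the graph structure.
a_hat = inner_product_decoder(z)

# Decoder 2: graph attention layer maps latent space -> genes (node properties).
x_hat = attention_layer(z, adj,
                        rng.normal(size=(n_latent, n_genes)),
                        rng.normal(size=n_genes), rng.normal(size=n_genes))

# Illustrative reconstruction losses (assumed, not taken from the paper).
eps = 1e-9
structure_loss = -np.mean(adj * np.log(a_hat + eps)
                          + (1.0 - adj) * np.log(1.0 - a_hat + eps))
attribute_loss = np.mean((x - x_hat) ** 2)
print(f"structure loss: {structure_loss:.3f}, attribute loss: {attribute_loss:.3f}")
```

In a trained model the two losses would be minimized jointly so that the embedding `z` retains both the graph structure and the expression profiles; cluster labels would then be derived from `z`. How scGAD builds the cell graph, weights the losses, and assigns clusters is not stated in the abstract.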

Data Availability

The datasets used in this study can be found at https://github.com/ZzzOctopus/scGAD.

Code Availability

scGAD is implemented in Python, and the source code is available at https://github.com/ZzzOctopus/scGAD.

Funding

This work was supported by the National Key Research and Development Project of China (2021YFA1000102, 2021YFA1000103).

Author information

Corresponding author

Correspondence to Yulin Zhang.

Ethics declarations

Competing interests

The authors have no competing interests to declare that are relevant to the content of this article.

Financial interests

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (pdf 692 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, S., Zhang, Y., Zhang, Y. et al. Graph attention autoencoder model with dual decoder for clustering single-cell RNA sequencing data. Appl Intell 54, 5136–5146 (2024). https://doi.org/10.1007/s10489-024-05442-w

  • DOI: https://doi.org/10.1007/s10489-024-05442-w
