Abstract
Single-cell RNA-seq (scRNA-seq) is a powerful technique for assaying the transcriptional profiles of individual cells. However, the high dropout rate and overdispersion inherent in scRNA-seq hinder the reliable quantification of genes. Recent bioinformatic studies have shifted from conventional gene-level analysis to the alternative polyadenylation (APA) isoform level, revealing cell-to-cell heterogeneity in APA usage and APA dynamics across cell types. This additional layer of APA isoforms creates considerable potential for cost-efficient approaches that dissect cell types by integrating multiple modalities derived from existing scRNA-seq experiments. Here we propose a pipeline called scAPAfuse for enhancing cell type clustering and identifying novel/rare cell types by combining gene expression and APA profiles from the same scRNA-seq data. scAPAfuse first maps gene expression and APA profiles to a shared low-dimensional space using partial least squares. Anchors (i.e., similar cells) between the gene and APA profiles are then identified by constructing the nearest neighbors of cells in the low-dimensional space, using algorithms such as hyperplane locality-sensitive hashing and shared nearest neighbor. Finally, the gene and APA profiles are integrated into a fused matrix using a Gaussian kernel function. Applying scAPAfuse to four public scRNA-seq datasets, including human peripheral blood mononuclear cells (PBMCs) and Arabidopsis roots, we found new subpopulations of cells that were undetectable using the gene expression or APA profile alone. scAPAfuse provides a unique strategy to mitigate the high sparsity of scRNA-seq by fusing gene expression and APA profiles to improve cell type clustering, and it can be included in many other routine scRNA-seq pipelines.
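The three steps named in the abstract (shared projection, anchor search, kernel-based fusion) can be illustrated with a small NumPy sketch. This is not the scAPAfuse implementation: the function names, the parameters k and sigma, and the toy data are hypothetical. It also makes simplifying substitutions. The shared space is obtained with the SVD form of partial least squares, anchors are found by brute-force mutual nearest neighbors rather than hyperplane locality-sensitive hashing or shared nearest neighbor, and the fusion rule (Gaussian-weighted averaging of anchored APA scores, concatenated with the gene scores) is only one plausible reading of "integrated using a Gaussian kernel function".

```python
import numpy as np

def pls_svd_scores(X, Y, n_components=2):
    """Project two views of the same cells into a shared space (PLS-SVD variant)."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Singular vectors of the cross-covariance give paired projection directions.
    U, _, Vt = np.linalg.svd(Xc.T @ Yc, full_matrices=False)
    sx = Xc @ U[:, :n_components]
    sy = Yc @ Vt[:n_components].T
    # Standardise columns so distances between the two views are comparable.
    sx = sx / (sx.std(axis=0) + 1e-12)
    sy = sy / (sy.std(axis=0) + 1e-12)
    return sx, sy

def mutual_knn_anchors(A, B, k=5):
    """Return (i, j, distance) where A[i] and B[j] are mutual k-nearest neighbours.

    Brute force for clarity; stands in for the LSH / SNN search in the abstract.
    """
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    nn_ab = np.argsort(d, axis=1)[:, :k]    # for each row of A: nearest rows of B
    nn_ba = np.argsort(d, axis=0)[:k, :].T  # for each row of B: nearest rows of A
    return [
        (i, int(j), float(d[i, j]))
        for i in range(A.shape[0])
        for j in nn_ab[i]
        if i in nn_ba[j]
    ]

def gaussian_fuse(gene_scores, apa_scores, anchors, sigma=1.0):
    """Fuse both views: gene scores plus Gaussian-weighted anchor APA scores (assumed rule)."""
    n = gene_scores.shape[0]
    W = np.zeros((n, n))
    for i, j, dist in anchors:
        W[i, j] = np.exp(-dist ** 2 / (2 * sigma ** 2))
    row_sum = W.sum(axis=1, keepdims=True)
    has_anchor = row_sum[:, 0] > 0
    smoothed = apa_scores.copy()  # cells without anchors keep their own APA scores
    smoothed[has_anchor] = (W[has_anchor] / row_sum[has_anchor]) @ apa_scores
    return np.hstack([gene_scores, smoothed])

# Toy data: 30 cells, 20 genes, 15 APA features sharing a 2-D latent structure.
rng = np.random.default_rng(0)
latent = rng.normal(size=(30, 2))
genes = latent @ rng.normal(size=(2, 20)) + 0.1 * rng.normal(size=(30, 20))
apa = latent @ rng.normal(size=(2, 15)) + 0.1 * rng.normal(size=(30, 15))

g, a = pls_svd_scores(genes, apa)
anchors = mutual_knn_anchors(g, a)
fused = gaussian_fuse(g, a, anchors)
print(fused.shape)  # (30, 4)
```

The fused matrix has one row per cell and could then be passed to any standard clustering step. The brute-force distance matrix scales quadratically with the number of cells, which is the main reason an approximate search such as locality-sensitive hashing would be preferred on real datasets.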
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant No. T2222007 to XW).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Xu, S., Kang, L., Bi, X., Wu, X. (2023). Integrative Analysis of Gene Expression and Alternative Polyadenylation from Single-Cell RNA-seq Data. In: Guo, X., Mangul, S., Patterson, M., Zelikovsky, A. (eds) Bioinformatics Research and Applications. ISBRA 2023. Lecture Notes in Computer Science, vol 14248. Springer, Singapore. https://doi.org/10.1007/978-981-99-7074-2_24
DOI: https://doi.org/10.1007/978-981-99-7074-2_24
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7073-5
Online ISBN: 978-981-99-7074-2
eBook Packages: Computer Science, Computer Science (R0)