Abstract
Single-cell RNA sequencing allows expression profiling of hundreds of thousands of individual cells in a single experiment. Its main drawback is that the proportion of zero counts observed at the single-cell level is much higher than at the bulk level. In this study, we analyzed potential sources of excess zeros using multi-omics data from a homogeneous breast cancer cell line. A comparison of expression data at the population and single-cell levels showed that variability between sequencing platforms is higher than variability between replicates on the same platform. A non-linear model was used to estimate the difference between the expected and observed number of zeros per gene. Then, using gene set enrichment analysis, we identified biological pathways containing genes with an increased or reduced number of zeros, such as ribosomal genes. Finally, we analyzed technical factors potentially influencing the dropout rate and found that the number of transcripts per gene, low mappability, and differences in transcript coverage uniformity might cause fluctuations in gene expression estimates at the single-cell level.
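The comparison of expected and observed zeros per gene can be illustrated with a minimal sketch. The abstract does not specify the non-linear model, so the code below assumes a simple Poisson baseline, under which the expected fraction of zeros for a gene with mean count m is exp(-m). The function name `zero_excess` and the simulated data are hypothetical and serve only as an illustration, not as the authors' method.

```python
import numpy as np


def zero_excess(counts):
    """Observed minus expected zero fraction per gene.

    counts: genes x cells array of non-negative counts.
    The expected fraction uses a Poisson baseline (an assumption,
    not the model used in the paper): P(X = 0) = exp(-mean).
    """
    counts = np.asarray(counts, dtype=float)
    mean_expr = counts.mean(axis=1)            # mean count per gene
    observed = (counts == 0).mean(axis=1)      # observed zero fraction
    expected = np.exp(-mean_expr)              # Poisson zero probability
    return observed - expected


# Simulated example: one gene following a Poisson distribution and one
# gene where 30% of cells are additionally forced to zero (dropout).
rng = np.random.default_rng(0)
poisson_gene = rng.poisson(2.0, size=5000)
dropout_gene = poisson_gene * rng.binomial(1, 0.7, size=5000)

excess = zero_excess(np.vstack([poisson_gene, dropout_gene]))
print(excess)  # first value near zero, second clearly positive
```

A positive value marks a gene with more zeros than the baseline predicts, and a negative value marks a gene with fewer. Real single-cell counts are usually overdispersed, so a negative binomial or another fitted mean-to-zero-fraction curve would likely be a more realistic baseline than the Poisson one assumed here.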
Acknowledgments
This work was financed by the Silesian University of Technology grant no. 02/070/BK22/0033 for maintaining and developing research potential (MM, JZ).
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Slowik, H., Zyla, J., Marczyk, M. (2022). Investigating Sources of Zeros in 10× Single-Cell RNAseq Data. In: Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Ortuño, F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2022. Lecture Notes in Computer Science, vol 13347. Springer, Cham. https://doi.org/10.1007/978-3-031-07802-6_6
DOI: https://doi.org/10.1007/978-3-031-07802-6_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-07801-9
Online ISBN: 978-3-031-07802-6
eBook Packages: Computer Science, Computer Science (R0)