Abstract
Single-cell RNA sequencing (scRNA-seq) has recently brought new insight into identifying and characterizing novel cell types and gene expression patterns. It has become indispensable for biological research, but it also poses significant challenges, which has led to an abundance of methods and tools that aim to automate specific problems in scRNA-seq data analysis. A prominent feature of scRNA-seq data is that it is noisy and high-dimensional, owing to large-scale expression datasets with hundreds of thousands of genes. Raw single-cell datasets are therefore generally reduced in dimensionality before classification. Many dimensionality reduction methods exist, such as Principal Component Analysis (PCA) and t-Distributed Stochastic Neighbor Embedding (t-SNE), and each affects the results of single-cell classification differently. However, choosing an appropriate dimensionality reduction method for gene expression data in scRNA-seq classification remains an unsolved problem. This paper integrates seven dimensionality reduction methods to process gene expression profiles, based on data fusion and ensemble learning. Before integration, we determine the optimal number of low-dimensional components for each dimensionality reduction method and each dataset. Unlike existing dimensionality reduction techniques, the proposed method implements data fusion and ensemble learning schemes that use a large number of weak learners for accurate classification. Our analysis demonstrates that classification using different data representations may capture more complete relationships in the data, and that integrating several dimensionality reduction methods improves single-cell classification performance.
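The general idea of fusing several reduced representations and classifying with an ensemble of weak learners can be sketched as follows. This is only an illustration under stated assumptions, not the SCDF implementation: the abstract does not name the seven reducers or the ensemble, so the sketch uses three scikit-learn reducers (PCA, truncated SVD, kernel PCA), a random forest, fixed component counts, and synthetic data in place of a real expression matrix.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA, KernelPCA, TruncatedSVD
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a cells x genes matrix (assumption: not real scRNA-seq data).
X, y = make_classification(
    n_samples=300, n_features=200, n_informative=20,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=90, stratify=y, random_state=0
)

# Data fusion: each reducer produces its own low-dimensional view and
# FeatureUnion concatenates the views column-wise.
# The component counts are hypothetical; the paper tunes them per method and dataset.
fusion = FeatureUnion([
    ("pca", PCA(n_components=10, random_state=0)),
    ("svd", TruncatedSVD(n_components=10, random_state=0)),
    ("kpca", KernelPCA(n_components=10, kernel="rbf")),
])

# Ensemble learning: a random forest combines many weak decision trees
# (assumption: chosen here for illustration only).
model = Pipeline([
    ("scale", StandardScaler()),
    ("fuse", fusion),
    ("forest", RandomForestClassifier(n_estimators=200, random_state=0)),
])
model.fit(X_train, y_train)

# Inspect the fused representation and the held-out accuracy.
fused_train = model[:-1].transform(X_train)
accuracy = model.score(X_test, y_test)
print(fused_train.shape, round(accuracy, 3))
```

Only reducers that can transform unseen cells fit this pipeline directly. t-SNE, as implemented in scikit-learn, has no `transform` method, so including it would require embedding training and test cells together or a different fusion design.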
Acknowledgements
This work was supported by the National Natural Science Foundation of China under Grant No. 12001408 and by the Science Foundation of Wuhan Institute of Technology under Grant No. K201746.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Fang, C., Li, Y. (2022). SCDF: A Novel Single-Cell Classification Method Based on Dimension-Reduced Data Fusion. In: Huang, DS., Jo, KH., Jing, J., Premaratne, P., Bevilacqua, V., Hussain, A. (eds) Intelligent Computing Theories and Application. ICIC 2022. Lecture Notes in Computer Science, vol 13394. Springer, Cham. https://doi.org/10.1007/978-3-031-13829-4_16
DOI: https://doi.org/10.1007/978-3-031-13829-4_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-13828-7
Online ISBN: 978-3-031-13829-4
eBook Packages: Computer Science, Computer Science (R0)