Gene selection in a single cell gene decision space based on class-consistent technology and fuzzy rough iterative computation model

Applied Intelligence

Abstract

This study explores gene selection in a single cell gene decision space (scgd-space) based on class-consistent technology and a fuzzy rough iterative computation model (FRIC-model). Gene expression data (ge-data) are characterized by small sample sizes, high dimensionality, and noise. Because of this high dimensionality, gene selection must be carried out before the data are clustered or classified. Existing gene selection methods based on equivalence relations are not effective for ge-data because they require strict equality between gene expression values. To overcome this weakness, class-consistent technology, which replaces equality with approximate equality between gene expression values, is first proposed. With the help of this technology, the idea that “the class consistency between gene expression values is fed back to the gene set” is then taken into account, and fuzzy symmetric relations on the cell set of a scgd-space are induced. In addition, fuzzy rough approximations in a scgd-space are defined. Next, the FRIC-model is given. This model employs an iterative computation strategy to define fuzzy rough approximations and dependency functions. A gene selection algorithm based on this model is designed. Finally, the designed algorithm is tested on several publicly available ge-data sets to evaluate its performance. The experimental results show that the designed algorithm is more effective than some existing algorithms.
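The general idea of selecting genes with a fuzzy rough dependency function can be sketched as follows. This is a generic textbook-style illustration, not the authors' FRIC-model, whose exact definitions are not given in this abstract. The linear `similarity` function, the `spread` parameter, the min aggregation over genes, the Kleene-Dienes style lower approximation, and the greedy stopping rule are all assumptions made for the example.

```python
from typing import List, Sequence


def similarity(x: float, y: float, spread: float) -> float:
    """Approximate equality of two expression values, in [0, 1] (assumed form)."""
    return max(0.0, 1.0 - abs(x - y) / spread)


def relation(data: Sequence[Sequence[float]], genes: Sequence[int],
             spread: float = 1.0) -> List[List[float]]:
    """Fuzzy symmetric relation on cells: min similarity over the chosen genes."""
    n = len(data)
    return [[min((similarity(data[i][g], data[j][g], spread) for g in genes),
                 default=1.0)
             for j in range(n)] for i in range(n)]


def dependency(data: Sequence[Sequence[float]], labels: Sequence[int],
               genes: Sequence[int], spread: float = 1.0) -> float:
    """Mean membership of each cell in the lower approximation of its own class."""
    rel = relation(data, genes, spread)
    n = len(data)
    total = 0.0
    for i in range(n):
        # A cell belongs to its class only as far as it is dissimilar
        # to every cell carrying a different label.
        total += min((1.0 - rel[i][j] for j in range(n) if labels[j] != labels[i]),
                     default=1.0)
    return total / n


def select_genes(data: Sequence[Sequence[float]], labels: Sequence[int],
                 spread: float = 1.0, tol: float = 1e-9) -> List[int]:
    """Greedy forward selection: add the gene that raises the dependency most."""
    remaining = list(range(len(data[0])))
    chosen: List[int] = []
    best = 0.0
    while remaining:
        scores = {g: dependency(data, labels, chosen + [g], spread)
                  for g in remaining}
        gene = max(remaining, key=lambda g: scores[g])
        if scores[gene] <= best + tol:
            break  # no remaining gene improves the dependency
        chosen.append(gene)
        remaining.remove(gene)
        best = scores[gene]
    return chosen


# Toy data (hypothetical): 4 cells, 3 genes, 2 classes.
cells = [[0.0, 0.5, 0.2], [0.1, 0.5, 0.8], [1.0, 0.5, 0.3], [0.9, 0.5, 0.7]]
classes = [0, 0, 1, 1]
selected = select_genes(cells, classes)
```

On the toy data only gene 0 separates the two classes, so the greedy loop stops after picking it. Replacing strict equality with a graded similarity is what lets nearly equal expression values count as consistent.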

Acknowledgements

The authors would like to thank the editors and the anonymous reviewers for their valuable comments and suggestions, which have helped immensely in improving the quality of the paper. This work is supported by the Guangxi First-class Discipline Statistics Construction Project Fund, the Natural Science Foundation of Guangxi Province (2021GXNSFAA220076, 2021GXNSFAA220114), the Key Fields Project of Universities in Guangdong Province (2023ZDZX1063, 2023ZDZX1065, 2021ZDZX4109), and the Scientific Research Platform of Guangdong Songshan Polytechnic (2022xjkypt02).

Author information

Corresponding authors

Correspondence to Guangji Yu or Dan Huang.

Ethics declarations

Conflicts of interest

All authors declare that there is no conflict of interest regarding the publication of this manuscript.

About this article

Cite this article

Zhang, J., Yu, G., Huang, D. et al. Gene selection in a single cell gene decision space based on class-consistent technology and fuzzy rough iterative computation model. Appl Intell 53, 30113–30132 (2023). https://doi.org/10.1007/s10489-023-05115-0

  • DOI: https://doi.org/10.1007/s10489-023-05115-0
