Abstract
In this article, we propose a novel type-2 fuzzy multigranulation-based SVM model for classifying gene expression patterns in human breast cancer datasets. First, a type-2 fuzzy multigranulation system is designed for the classification task, which must deal with noisy and nonlinear microarray gene expression patterns. Fuzzy if–then rules are then devised on the microarray feature vectors to enable an accurate bilinear classification process. Within the type-2 fuzzy multigranulation system, these rules identify expression patterns that are differentially expressed between the normal state and the carcinogenic state. The proposed method reduces the structural complexity of the (type-1) fuzzy if–then rules because it works on upper and lower membership functions instead of a single membership function. In addition, a fuzzy rough approximation is used in the model to reduce the computational cost. Lastly, associations among genes whose expression patterns differ significantly between the normal state and the malignant state are recognized with respect to their nature. The proposed method has been evaluated on eight microarray gene expression datasets from human breast cancer patients. Moreover, we have validated the results using the F-score and the NCBI database; the validation indicates that the proposed model performs better than the existing methods.
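To make the idea of upper and lower membership functions concrete, the following is a minimal sketch of an interval type-2 fuzzy membership grade. It is not the authors' model: the Gaussian shape, the fixed mean, the uncertain standard deviation, and the names `gaussian` and `interval_type2_membership` are all assumptions chosen for illustration only.

```python
import math


def gaussian(x, mean, sigma):
    """Type-1 Gaussian membership grade of x (value in (0, 1])."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)


def interval_type2_membership(x, mean, sigma_lower, sigma_upper):
    """Return (lower, upper) membership grades of x.

    Assumption (illustrative, not taken from the article): the primary
    membership function is Gaussian with a fixed mean and a standard
    deviation known only to lie in [sigma_lower, sigma_upper].
    For a fixed mean, the narrower Gaussian bounds the grade from below
    and the wider Gaussian bounds it from above.
    """
    if not 0 < sigma_lower <= sigma_upper:
        raise ValueError("require 0 < sigma_lower <= sigma_upper")
    lower = gaussian(x, mean, sigma_lower)
    upper = gaussian(x, mean, sigma_upper)
    return lower, upper


if __name__ == "__main__":
    # Hypothetical normalized expression value of a single gene.
    lo, hi = interval_type2_membership(0.8, mean=0.0, sigma_lower=0.5, sigma_upper=1.0)
    print(f"lower grade = {lo:.4f}, upper grade = {hi:.4f}")
```

Under these assumptions, a single expression value is mapped to an interval of membership grades rather than one number, which is what lets a type-2 rule absorb some of the noise in microarray measurements.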
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest regarding the research data or any other aspect of this work.
About this article
Cite this article
Ghosh, S.K., Ghosh, A. Classification of gene expression patterns using a novel type-2 fuzzy multigranulation-based SVM model for the recognition of cancer mediating biomarkers. Neural Comput & Applic 33, 4263–4281 (2021). https://doi.org/10.1007/s00521-020-05241-7
DOI: https://doi.org/10.1007/s00521-020-05241-7