Abstract
Gene expression clustering methods for building gene co-expression networks are severely limited by the biological complexity of cells. This paper proposes a fuzzy soft subspace clustering method for detecting overlapping clusters of locally co-expressed genes that may participate in multiple cellular processes and take on different biological functions. The method extracts process-specific cluster subspaces and the interactions among gene clusters, providing useful information for gene co-expression network analysis. Experiments on the yeast cell cycle benchmark microarray data show that the method is effective in extracting underlying biological relationships between genes and in enhancing gene co-expression network inference.
Acknowledgments
This work is supported by the China Postdoctoral Science Foundation (2015M572361) and the National Natural Science Foundation of China (61503252 and 61170040).
Wang, Q., Chen, G. Fuzzy soft subspace clustering method for gene co-expression network analysis. Int. J. Mach. Learn. & Cyber. 8, 1157–1165 (2017). https://doi.org/10.1007/s13042-015-0486-7