Qualitative assessment of functional module detectors on microarray and RNASeq data

  • Review Article
Network Modeling Analysis in Health Informatics and Bioinformatics

Abstract

A set of correlated and co-expressed genes, often referred to as a functional module, plays a synergistic role in a disease or other biological activity. Genes participating in a common module may cause clinically similar diseases and share a common genetic origin of their associated disease phenotypes. Identifying such modules may help in the system-level understanding of biological and cellular processes or of the pathophysiologic basis of associated diseases. As a result, detecting such functional modules is an active research topic in computational biology. Several techniques have been proposed to find functional modules based on gene co-regulation or co-expression data. These methods are broadly categorized into non-network-based gene expression clustering techniques and network-based methods that extract modules from gene co-expression networks built from expression data. We survey the main approaches for obtaining modules and evaluate their performance in finding biologically significant gene modules on both microarray and RNASeq data. Apart from independent assessments, no prior effort has evaluated their performance in an integrated way on both microarray and RNASeq data. We assess the significance of the modules in terms of gene ontology and pathway analysis. We select a few of the best performers to assess their capability in finding disease-specific modules. Our comparison reveals that no single algorithm is a winner in all respects. Moreover, performance varies widely between microarray and RNASeq data. Biclustering performs relatively better on microarray expression data but fails to perform well on RNASeq data, whereas network-based techniques work better on RNASeq data.
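To make the network-based category concrete, the following is a minimal sketch and not any of the surveyed algorithms. It assumes a simple scheme: genes are linked when the absolute Pearson correlation of their expression profiles reaches a cutoff, and each connected component of the resulting co-expression network is reported as a module. The function names, the toy gene names, the expression values, and the 0.9 threshold are all hypothetical choices made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def coexpression_modules(expression, threshold=0.9):
    """Return modules as connected components of a thresholded network.

    `expression` maps a gene name to its list of expression values.
    The absolute-correlation cutoff is an illustrative assumption.
    """
    genes = list(expression)
    neighbours = {g: set() for g in genes}
    for g1, g2 in combinations(genes, 2):
        if abs(pearson(expression[g1], expression[g2])) >= threshold:
            neighbours[g1].add(g2)
            neighbours[g2].add(g1)

    modules, seen = [], set()
    for gene in genes:
        if gene in seen:
            continue
        # Depth-first traversal collects one connected component.
        stack, component = [gene], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbours[current] - component)
        seen |= component
        modules.append(sorted(component))
    return modules


# Toy expression matrix: hypothetical genes, four samples each.
toy_expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 4.0, 6.2, 8.1],
    "geneC": [4.0, 1.0, 3.5, 0.5],
    "geneD": [8.1, 2.2, 7.0, 1.0],
    "geneE": [5.0, 5.0, 5.0, 5.0],
}
modules = coexpression_modules(toy_expression, threshold=0.9)
```

On this toy input, geneA and geneB form one module, geneC and geneD another, and the flat geneE profile stays alone. Connected components are the crudest possible module definition; the network-based methods covered by the survey use richer criteria, so this sketch only illustrates the general pipeline of building a co-expression network and then partitioning it.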

Author information

Correspondence to Swarup Roy.

Cite this article

Jha, M., Guzzi, P.H. & Roy, S. Qualitative assessment of functional module detectors on microarray and RNASeq data. Netw Model Anal Health Inform Bioinforma 8, 1 (2019). https://doi.org/10.1007/s13721-018-0180-2

  • DOI: https://doi.org/10.1007/s13721-018-0180-2
