Abstract
Network-based analyses are commonly used as powerful tools to interpret the findings of genome-wide association studies (GWAS) in a functional context. In particular, identification of disease-associated functional modules, i.e., highly connected protein-protein interaction (PPI) subnetworks with high aggregate disease association, has been shown to be promising in uncovering the functional relationships among genes and proteins associated with diseases. An important issue in this regard is the scoring of subnetworks by integrating two quantities that are not readily compatible: the disease association of individual gene products and the network connectivity among proteins. Current scoring schemes either disregard the level of connectivity and focus on the aggregate disease association of connected proteins, or use a linear combination of these two quantities. However, such scoring schemes may produce arbitrarily large subnetworks that are often not statistically significant, or they require tuning of the parameters used to weigh the contributions of network connectivity and disease association. Here, we propose a parameter-free scoring scheme that scores subnetworks by assessing the disease association of pairwise interactions and incorporating the statistical significance of both network connectivity and disease association. We test the proposed scoring scheme on a GWAS dataset for type II diabetes (T2D). Our results suggest that subnetworks identified by commonly used methods may fail tests of statistical significance after correction for multiple hypothesis testing. In contrast, the proposed scoring scheme yields highly significant subnetworks, which contain biologically relevant proteins that cannot be identified by analysis of genome-wide association data alone.
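To make the scoring issue concrete, the following is a minimal sketch of the two kinds of existing schemes mentioned above, not of the proposed parameter-free scheme, whose details the abstract does not give. Everything in it is an illustrative assumption: the toy z-scores and PPI edges, the function names, the choice of a sum of z-scores normalized by the square root of subnetwork size as the aggregate association, edge density as the connectivity measure, the mixing weight `lam`, and Bonferroni as the multiple-testing correction.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy data: per-protein disease-association z-scores and PPI edges.
z_scores = {"A": 2.1, "B": 1.4, "C": 2.5, "D": 2.8}
edges = {frozenset(p) for p in [("A", "B"), ("B", "D"), ("A", "D"), ("C", "D")]}


def aggregate_association(nodes):
    """Sum of z-scores divided by sqrt(size); ignores connectivity entirely."""
    nodes = list(nodes)
    return sum(z_scores[n] for n in nodes) / sqrt(len(nodes))


def edge_density(nodes):
    """Fraction of protein pairs in the subnetwork that interact."""
    nodes = list(nodes)
    possible = len(nodes) * (len(nodes) - 1) / 2
    if possible == 0:
        return 0.0
    present = sum(
        1 for u, v in combinations(nodes, 2) if frozenset((u, v)) in edges
    )
    return present / possible


def linear_combination(nodes, lam):
    """Weighted mix of association and connectivity; lam must be tuned."""
    return lam * aggregate_association(nodes) + (1 - lam) * edge_density(nodes)


def bonferroni(p_values, alpha=0.05):
    """Flag p-values that remain significant after Bonferroni correction."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


small = ["A", "B", "D"]        # fully connected triangle
large = ["A", "B", "C", "D"]   # higher aggregate association, sparser

if __name__ == "__main__":
    for lam in (0.1, 0.9):
        best = max((small, large), key=lambda s: linear_combination(s, lam))
        print(f"lam={lam}: preferred subnetwork = {best}")
    print(bonferroni([0.001, 0.02, 0.04]))
```

In this toy example the preferred subnetwork depends on the value chosen for `lam`, which is the parameter-tuning problem the abstract describes, and a nominally significant p-value can stop being significant once the number of tested subnetworks is taken into account.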
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ayati, M., Erten, S., Koyutürk, M. (2014). What Do We Learn from Network-Based Analysis of Genome-Wide Association Data? In: Esparcia-Alcázar, A., Mora, A. (eds) Applications of Evolutionary Computation. EvoApplications 2014. Lecture Notes in Computer Science, vol 8602. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45523-4_70
DOI: https://doi.org/10.1007/978-3-662-45523-4_70
Published:
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-45522-7
Online ISBN: 978-3-662-45523-4
eBook Packages: Computer Science, Computer Science (R0)