Abstract
Recent genome sequencing studies have shown that the somatic mutations that drive cancer development are distributed across a large number of genes. This mutational heterogeneity complicates efforts to distinguish functional mutations from sporadic passenger mutations. Since cancer mutations are hypothesized to target a relatively small number of cellular signaling and regulatory pathways, a common approach is to assess whether known pathways are enriched for mutated genes. However, restricting attention to known pathways will not reveal novel cancer genes or pathways. An alternative strategy is to examine mutated genes in the context of genome-scale interaction networks that include both well-characterized pathways and additional gene interactions measured through various approaches. We introduce a computational framework for de novo identification of subnetworks in a large gene interaction network that are mutated in a significant number of patients. This framework includes two major features. First, we introduce a diffusion process on the interaction network to define a local neighborhood of “influence” for each mutated gene in the network. Second, we derive a two-stage multiple hypothesis test to bound the false discovery rate (FDR) associated with the identified subnetworks. We test these algorithms on a large human protein-protein interaction network using mutation data from two recent studies: glioblastoma samples from The Cancer Genome Atlas and lung adenocarcinoma samples from the Tumor Sequencing Project. We successfully recover pathways that are known to be important in these cancers, such as the p53 pathway. We also identify additional pathways, such as the Notch signaling pathway, that have been implicated in other cancers but not previously reported as mutated in these samples. Our approach is the first, to our knowledge, to demonstrate a computationally efficient strategy for de novo identification of statistically significant mutated subnetworks. We anticipate that our approach will find increasing use as cancer genome studies increase in size and scope.
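To make the idea of a diffusion-based neighborhood of “influence” concrete, the following is a minimal sketch and not the implementation used in the paper. It assumes one particular formulation: a unit source is placed at a gene, fluid spreads along the edges of an undirected graph, and every node loses fluid at a rate gamma. At equilibrium, the amounts are given by the inverse of the graph Laplacian plus gamma times the identity. The function names, the value of gamma, the threshold, and the four-gene toy graph are all hypothetical and chosen only for illustration.

```python
import numpy as np


def influence_matrix(adjacency, gamma=1.0):
    """Equilibrium of a diffusion with loss rate gamma (assumed formulation).

    Column s holds the steady-state amount at every node when a unit
    source is placed at node s.
    """
    A = np.asarray(adjacency, dtype=float)
    degree = A.sum(axis=1)
    laplacian = np.diag(degree) - A
    # L + gamma * I is positive definite for gamma > 0, so it is invertible.
    return np.linalg.inv(laplacian + gamma * np.eye(A.shape[0]))


def neighborhood(influence, source, threshold):
    """Nodes (other than the source) that receive at least `threshold`."""
    n = influence.shape[0]
    return [j for j in range(n) if j != source and influence[j, source] >= threshold]


# Toy path graph g0 - g1 - g2 - g3 (made-up genes, not real interaction data).
adjacency = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])

influence = influence_matrix(adjacency, gamma=1.0)
near_g0 = neighborhood(influence, source=0, threshold=0.2)
```

In this toy graph the influence of g0 falls off with distance along the path, so a threshold selects only nearby genes. A larger gamma keeps more of the fluid near the source and shrinks the neighborhood, while a smaller gamma lets it spread further.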
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Vandin, F., Upfal, E., Raphael, B.J. (2010). Algorithms for Detecting Significantly Mutated Pathways in Cancer. In: Berger, B. (ed.) Research in Computational Molecular Biology. RECOMB 2010. Lecture Notes in Computer Science, vol 6044. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12683-3_33
DOI: https://doi.org/10.1007/978-3-642-12683-3_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12682-6
Online ISBN: 978-3-642-12683-3
eBook Packages: Computer Science (R0)