
Connectedness Profiles in Protein Networks for the Analysis of Gene Expression Data

  • Conference paper
Research in Computational Molecular Biology (RECOMB 2007)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 4453)

Abstract

Knowledge about protein function is often encoded as large, sparse undirected graphs in which vertices are proteins and edges represent functional relationships between them. An elementary task in the computational use of these networks is quantifying the density of edges, referred to as connectedness, inside a prescribed protein set. For instance, many functional modules can be identified by their high connectedness. Because individual proteins can have very different numbers of interactions, a connectedness measure should be well normalized for vertex degree: its distribution across random sets of vertices should not change when these sets are biased toward hubs. We show that such degree-robustness can be achieved via an analytical framework based on a random graph model with given expected degrees. We also introduce the concept of a connectedness profile, which characterizes the relation between adjacency in a graph and a prescribed order of its vertices. A straightforward application to gene expression data and protein networks is the identification of tissue-specific functional modules or of cellular processes perturbed in an experiment. The strength of the mapping between gene-expression score and interaction in the network is measured by the area of the connectedness profile. Deriving the distribution of this area under the random graph model enables us to define degree-robust statistics that can be computed in \(O(M)\) time, where \(M\) is the network size. These statistics can identify groups of microarray experiments that are pathway-coherent and, more generally, vertex attributes that relate to adjacency in a graph.
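The ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' statistic: the abstract gives no formulas, so the sketch assumes the common form of a random graph with given expected degrees in which vertices i and j are joined with probability d_i d_j / (2m), and it uses a plain Poisson tail as a rough significance score. All function names are hypothetical and chosen for illustration only.

```python
import math

def observed_edges(edges, subset):
    """Count edges whose two endpoints both lie in `subset`."""
    members = set(subset)
    return sum(1 for u, v in edges if u in members and v in members)

def expected_edges(degrees, subset):
    """Expected edge count inside `subset`, ASSUMING edge probability
    d_i * d_j / (2m) for each pair i < j (an assumption, not from the paper)."""
    two_m = sum(degrees.values())            # sum of degrees = 2m
    s1 = sum(degrees[v] for v in subset)     # sum of degrees in the set
    s2 = sum(degrees[v] ** 2 for v in subset)
    # sum over pairs i<j of d_i*d_j equals (s1^2 - s2) / 2
    return (s1 * s1 - s2) / (2.0 * two_m)

def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam); used here as an approximate score."""
    below = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return 1.0 - below

def connectedness_profile(edges, order):
    """For each prefix of `order`, the number of edges inside that prefix.
    One pass over the edges plus one pass over the vertices."""
    rank = {v: i for i, v in enumerate(order)}
    added = [0] * len(order)
    for u, v in edges:
        if u in rank and v in rank and u != v:
            # the edge first appears once both endpoints are in the prefix
            added[max(rank[u], rank[v])] += 1
    profile, total = [], 0
    for a in added:
        total += a
        profile.append(total)
    return profile

# Toy network: a triangle (a, b, c) with a pendant vertex d attached to c.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
degrees = {"a": 2, "b": 2, "c": 3, "d": 1}
module = ["a", "b", "c"]

obs = observed_edges(edges, module)
exp = expected_edges(degrees, module)
score = poisson_upper_tail(obs, exp)
profile = connectedness_profile(edges, ["a", "b", "c", "d"])
area = sum(profile)
```

In this toy case the triangle holds 3 edges against an expectation of 2 under the assumed model, and the profile for the order a, b, c, d is [0, 1, 3, 4]. In a gene expression setting the order would come from ranking genes by an expression score, so a profile that rises early suggests that high-scoring genes are adjacent in the network. Judging whether the area is significant requires its null distribution, which the paper derives and this sketch does not attempt.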

Editor information

Terry Speed Haiyan Huang

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Pradines, J., Dančík, V., Ruttenberg, A., Farutin, V. (2007). Connectedness Profiles in Protein Networks for the Analysis of Gene Expression Data. In: Speed, T., Huang, H. (eds) Research in Computational Molecular Biology. RECOMB 2007. Lecture Notes in Computer Science, vol 4453. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71681-5_21

  • DOI: https://doi.org/10.1007/978-3-540-71681-5_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-71680-8

  • Online ISBN: 978-3-540-71681-5

  • eBook Packages: Computer Science, Computer Science (R0)
