Connectedness Profiles in Protein Networks for the Analysis of Gene Expression Data

Pradines, Joël; Dančík, Vlado; Ruttenberg, Alan; Farutin, Victor

doi:10.1007/978-3-540-71681-5_21

Joël Pradines¹, Vlado Dančík¹,², Alan Ruttenberg¹ & Victor Farutin¹,³

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 4453)

Included in the following conference series:

Annual International Conference on Research in Computational Molecular Biology

Abstract

Knowledge about protein function is often encoded in the form of large, sparse undirected graphs in which vertices are proteins and edges represent their functional relationships. One elementary task in the computational use of these networks is quantifying the density of edges, referred to as connectedness, inside a prescribed protein set. For instance, many functional modules can be identified because of their high connectedness. Since individual proteins can have very different numbers of interactions, a connectedness measure should be well normalized for vertex degree. Namely, its distribution across random sets of vertices should not be affected when these sets are biased toward hubs. We show that such degree-robustness can be achieved via an analytical framework based on a random graph model with given expected degrees. We also introduce the concept of a connectedness profile, which characterizes the relation between adjacency in a graph and a prescribed order of its vertices. A straightforward application to gene expression data and protein networks is the identification of tissue-specific functional modules or of cellular processes perturbed in an experiment. The strength of the mapping between gene-expression score and interaction in the network is measured by the area of the connectedness profile. Deriving the distribution of this area under the random graph model enables us to define degree-robust statistics that can be computed in \(O(M)\), \(M\) being the network size. These statistics can identify groups of microarray experiments that are pathway-coherent and, more generally, vertex attributes that relate to adjacency in a graph.
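
The ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the preview does not give the paper's exact definitions, so the sketch makes its own assumptions. It assumes the common Chung-Lu form of a random graph with given expected degrees, where the edge probability between vertices i and j is min(1, d_i d_j / 2m), and it assumes a simple reading of a connectedness profile as the number of edges among the first k vertices of a prescribed order. All function names are hypothetical.

```python
from itertools import combinations


def degrees(edges):
    """Vertex degrees of an undirected graph given as a list of (u, v) pairs."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def observed_edges(edges, subset):
    """Number of edges with both endpoints inside the subset."""
    members = set(subset)
    return sum(1 for u, v in edges if u in members and v in members)


def expected_edges(deg, subset, total_edges):
    """Expected edge count inside the subset under the assumed Chung-Lu model.

    Assumption: P(edge between i and j) = min(1, d_i * d_j / (2 * m)).
    """
    two_m = 2 * total_edges
    return sum(
        min(1.0, deg.get(u, 0) * deg.get(v, 0) / two_m)
        for u, v in combinations(subset, 2)
    )


def connectedness_profile(edges, order):
    """Edges among the first k vertices of `order`, for k = 1..len(order).

    Assumed, simplified definition. Each edge is counted at the rank of its
    later endpoint, so one pass over the edges plus one pass over the
    vertices is enough.
    """
    rank = {v: i for i, v in enumerate(order)}
    closing = [0] * len(order)
    for u, v in edges:
        if u in rank and v in rank:
            closing[max(rank[u], rank[v])] += 1
    profile, running = [], 0
    for count in closing:
        running += count
        profile.append(running)
    return profile


# Toy network: a triangle a-b-c with a pendant vertex d attached to c.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
deg = degrees(edges)
subset = ["a", "b", "c"]

obs = observed_edges(edges, subset)
exp = expected_edges(deg, subset, len(edges))
profile = connectedness_profile(edges, ["a", "b", "c", "d"])
area = sum(profile)
```

Comparing `obs` with `exp` rather than with a degree-blind baseline is what keeps hub-rich sets from looking artificially dense. The sum of the profile stands in here for the area statistic; the paper's normalization and its null distribution are not reproduced.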

Author information

Authors and Affiliations

Computational Sciences, Millennium Pharmaceuticals Inc., 40 Landsdowne Street, Cambridge, MA 02139, USA
Joël Pradines, Vlado Dančík, Alan Ruttenberg & Victor Farutin
Mathematical Institute, Slovak Academy of Sciences, Grešákova 6, 040 01 Košice, Slovakia
Vlado Dančík
Pfizer Global R&D, Research Technology Center, 620 Memorial Drive, Cambridge, MA 02139, USA
Victor Farutin

Editor information

Terry Speed, Haiyan Huang

Cite this paper

Pradines, J., Dančík, V., Ruttenberg, A., Farutin, V. (2007). Connectedness Profiles in Protein Networks for the Analysis of Gene Expression Data. In: Speed, T., Huang, H. (eds) Research in Computational Molecular Biology. RECOMB 2007. Lecture Notes in Computer Science, vol 4453. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71681-5_21

DOI: https://doi.org/10.1007/978-3-540-71681-5_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71680-8
Online ISBN: 978-3-540-71681-5
eBook Packages: Computer Science, Computer Science (R0)
