Abstract
The increasing availability of large-scale protein-protein interaction (PPI) data has made it possible to study the basic components and organization of the cell machinery at the network level. Many studies have shown that clustering a protein interaction network (PIN) is an effective approach for identifying protein complexes or functional modules. A significant number of proteins in such networks remain uncharacterized, and predicting their function remains a major challenge in systems biology. We propose a protein annotation method based on spectral clustering, which first transforms the PIN using the normalized Laplacian of the PIN graph and then applies a classic clustering algorithm such as k-means. Protein functions are assigned based on cluster information. Experiments were performed on PPI data from baker's yeast. Because the network is noisy and still incomplete, we pre-process and purify it first. We also weight the network based on the annotation correlation between nodes. Results reveal improvement over previous techniques.
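The pipeline outlined in the abstract (normalized Laplacian, k-means, cluster-based annotation) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the six-protein toy network, the row normalization of the embedding, the farthest-point k-means initialization, and the majority-vote annotation rule are all assumptions made for the example. The pre-processing and annotation-based weighting steps are not shown.

```python
import numpy as np
from collections import Counter


def normalized_laplacian(adj):
    """Symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}."""
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros(len(deg))
    nonzero = deg > 0
    inv_sqrt[nonzero] = 1.0 / np.sqrt(deg[nonzero])
    return np.eye(len(adj)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]


def kmeans(points, k, iters=100):
    """Plain Lloyd k-means with deterministic farthest-point initialization
    (an assumption of this sketch, chosen so the example is reproducible)."""
    centers = [points[0]]
    for _ in range(1, k):
        dists = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(dists)])
    centers = np.array(centers)
    for _ in range(iters):
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def spectral_clusters(adj, k):
    """Embed nodes with the k smallest Laplacian eigenvectors, then cluster."""
    _, vecs = np.linalg.eigh(normalized_laplacian(adj))  # eigenvalues ascending
    emb = vecs[:, :k]
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    return kmeans(emb / norms, k)  # row normalization is an assumption


def annotate_by_cluster(labels, known):
    """Give each unannotated protein the most common known function in its
    cluster (a simple majority vote, assumed here for illustration)."""
    predictions = {}
    for node, lab in enumerate(labels):
        if node in known:
            continue
        members = [int(m) for m in np.flatnonzero(labels == lab)]
        votes = Counter(known[m] for m in members if m in known)
        if votes:
            predictions[node] = votes.most_common(1)[0][0]
    return predictions


# Hypothetical toy PIN: two triangles of proteins joined by a single edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = np.zeros((6, 6))
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0

# Hypothetical annotations; proteins 2 and 3 are treated as uncharacterized.
known = {0: "ribosome biogenesis", 1: "ribosome biogenesis",
         4: "DNA repair", 5: "DNA repair"}

labels = spectral_clusters(adj, k=2)
print(annotate_by_cluster(labels, known))
```

In this sketch a weighted network would simply replace the 0/1 entries of `adj` with edge weights; how the weights are derived from annotation correlation is not specified in the abstract and is left out here.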
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Trivodaliev, K., Cingovska, I., Kalajdziski, S. (2011). Protein Function Prediction by Spectral Clustering of Protein Interaction Network. In: Kim, Th., et al. Database Theory and Application, Bio-Science and Bio-Technology. BSBT 2011, DTA 2011. Communications in Computer and Information Science, vol 258. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27157-1_12
DOI: https://doi.org/10.1007/978-3-642-27157-1_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-27156-4
Online ISBN: 978-3-642-27157-1
eBook Packages: Computer Science, Computer Science (R0)