Abstract
The increased availability of large-scale protein-protein interaction (PPI) data has made possible a network-level understanding of the basic components and organization of the cell machinery. A significant number of proteins in protein interaction networks (PINs) remain uncharacterized, and predicting their function remains a major challenge. We propose a novel distance metric for PIN clustering. First, we augment the graph representing the PIN with weights derived from Gene Ontology (GO) semantic similarity, and we use this augmented representation in a random walk with restarts (RWR) process. The distance between a pair of proteins is then calculated from the steady-state distribution of the RWR. We validate our approach by function prediction via clustering in a purified and reliable Saccharomyces cerevisiae PIN. We show that the improvement in function prediction performance obtained with the novel distance metric is significant compared to traditional approaches.
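The pipeline outlined in the abstract (GO-weighted graph, RWR, distance from the steady-state distribution) can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so several choices here are assumptions made only for illustration: the restart probability of 0.3, the L1 distance between the two steady-state vectors, the function names, and the toy weight matrix standing in for GO semantic similarity scores.

```python
def rwr_steady_state(weights, seed, restart=0.3, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a random walk with restarts.

    weights: symmetric n x n list of lists of non-negative edge weights
             (here: hypothetical GO semantic similarity on PIN edges).
    seed:    index of the protein the walk restarts at.
    """
    n = len(weights)
    # Column-normalise the weights to obtain transition probabilities.
    col_sums = [sum(weights[i][j] for i in range(n)) for j in range(n)]
    trans = [
        [weights[i][j] / col_sums[j] if col_sums[j] > 0 else 0.0 for j in range(n)]
        for i in range(n)
    ]
    start = [1.0 if i == seed else 0.0 for i in range(n)]
    p = start[:]
    for _ in range(max_iter):
        # One step: follow an edge with prob. (1 - restart), else jump to seed.
        nxt = [
            (1 - restart) * sum(trans[i][j] * p[j] for j in range(n))
            + restart * start[i]
            for i in range(n)
        ]
        change = sum(abs(a - b) for a, b in zip(nxt, p))
        p = nxt
        if change < tol:
            break
    return p


def rwr_distance(weights, u, v, restart=0.3):
    """Assumed distance: L1 difference of the steady states seeded at u and v."""
    pu = rwr_steady_state(weights, u, restart)
    pv = rwr_steady_state(weights, v, restart)
    return sum(abs(a - b) for a, b in zip(pu, pv))


# Toy PIN with four proteins: two strongly related pairs (0-1 and 2-3)
# joined by one weak edge (1-2). The weights are made-up similarity scores.
W = [
    [0.0, 0.9, 0.0, 0.0],
    [0.9, 0.0, 0.1, 0.0],
    [0.0, 0.1, 0.0, 0.8],
    [0.0, 0.0, 0.8, 0.0],
]

d_close = rwr_distance(W, 0, 1)  # proteins in the same tight pair
d_far = rwr_distance(W, 0, 3)    # proteins in different pairs
```

In this toy network the distance between proteins 0 and 1 comes out smaller than the distance between proteins 0 and 3, which is the behaviour a clustering step would rely on. A pairwise distance matrix built this way could then be passed to any distance-based clustering method.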
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Trivodaliev, K., Ivanoska, I., Kalajdziski, S., Kocarev, L. (2015). Novel Gene Ontology Based Distance Metric for Function Prediction via Clustering in Protein Interaction Networks. In: Bogdanova, A., Gjorgjevikj, D. (eds) ICT Innovations 2014. Advances in Intelligent Systems and Computing, vol 311. Springer, Cham. https://doi.org/10.1007/978-3-319-09879-1_17
DOI: https://doi.org/10.1007/978-3-319-09879-1_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09878-4
Online ISBN: 978-3-319-09879-1
eBook Packages: Engineering, Engineering (R0)