Abstract
Interactions between proteins are key to most biological processes, but testing them exhaustively in the laboratory is costly in both money and time. Computational approaches for predicting such interactions are an important alternative. This study presents a novel approach to this prediction problem: calibrated synthetic networks serve as training input for a decision tree ensemble model that uses relevant topological information. The trained model is then used to predict interactions on the human interactome as a case study. Results show that deterministic metrics perform better than their stochastic counterparts, although one feature combination used with a random forest model achieves comparable precision.
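The abstract does not spell out the generator or the features, so the following is only a minimal sketch of the general idea, under stated assumptions: a pure duplication-divergence growth rule in which each new node copies a random existing node and keeps each inherited edge with probability `p`, followed by two illustrative topological features (common neighbours and length-three paths) for non-adjacent node pairs. The function names, the seed graph, the choice to keep isolated duplicates, and the parameter values are all hypothetical and are not taken from the paper.

```python
import random
from itertools import combinations


def duplication_divergence(n, p, seed=None):
    """Grow an undirected graph to n nodes by duplication and divergence.

    Assumed rule: copy a uniformly chosen node, then keep each of its
    edges independently with probability p. Duplicates that keep no
    edges stay in the graph as isolated nodes (a simplifying choice).
    """
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}  # assumed seed graph: a single edge
    while len(adj) < n:
        parent = rng.choice(sorted(adj))
        new = len(adj)
        kept = {v for v in sorted(adj[parent]) if rng.random() < p}
        adj[new] = kept
        for v in kept:
            adj[v].add(new)
    return adj


def common_neighbors(adj, u, v):
    """Number of nodes adjacent to both u and v."""
    return len(adj[u] & adj[v])


def paths_of_length_three(adj, u, v):
    """Number of simple paths u - a - b - v."""
    count = 0
    for a in adj[u]:
        if a == v:
            continue
        for b in adj[a]:
            if b != u and b != v and v in adj[b]:
                count += 1
    return count


def pair_features(adj):
    """Map each non-adjacent pair (u, v) to its feature tuple."""
    rows = {}
    for u, v in combinations(sorted(adj), 2):
        if v not in adj[u]:
            rows[(u, v)] = (
                common_neighbors(adj, u, v),
                paths_of_length_three(adj, u, v),
            )
    return rows


# Hypothetical parameter values, chosen only for illustration.
graph = duplication_divergence(n=50, p=0.4, seed=1)
features = pair_features(graph)
```

Feature rows of this kind, computed on synthetic networks where held-out edges supply the labels, are the sort of input a decision tree ensemble could be trained on before being applied to a real interactome.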
Acknowledgments
This work was funded by the OMICAS program: Optimización Multiescala In-silico de Cultivos Agrícolas Sostenibles (Infraestructura y Validación en Arroz y Caña de Azúcar), anchored at the Pontificia Universidad Javeriana in Cali and funded within the Colombian Scientific Ecosystem by The World Bank, the Colombian Ministry of Science, Technology and Innovation, the Colombian Ministry of Education, the Colombian Ministry of Industry and Tourism, and ICETEX, under grant ID FP44842-217-2018.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
López-Rozo, N., Finke, J., Rocha, C. (2023). Using the Duplication-Divergence Network Model to Predict Protein-Protein Interactions. In: Cherifi, H., Mantegna, R.N., Rocha, L.M., Cherifi, C., Miccichè, S. (eds) Complex Networks and Their Applications XI. COMPLEX NETWORKS 2022. Studies in Computational Intelligence, vol 1077. Springer, Cham. https://doi.org/10.1007/978-3-031-21127-0_27
DOI: https://doi.org/10.1007/978-3-031-21127-0_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-21126-3
Online ISBN: 978-3-031-21127-0
eBook Packages: Engineering, Engineering (R0)