GA-PPI-Net: A Genetic Algorithm for Community Detection in Protein-Protein Interaction Networks

  • Conference paper
Software Technologies (ICSOFT 2019)

Abstract

Community detection has become an important research direction for data mining in complex networks. It aims to identify topological structures and discover patterns in such networks. In this paper, we focus on detecting communities in Protein-Protein or Gene-Gene Interaction (PPI) networks. These networks represent sets of proteins or genes that collaborate in the same cellular function. The goal is to identify such semantic and topological communities using gene annotation sources such as Gene Ontology. We propose a Genetic Algorithm (GA) based approach to detect communities of different sizes in PPI networks. For this purpose, we introduce three specific components to the GA: a fitness function based on a similarity measure and the interaction value between proteins or genes, a solution representation for a community with dynamic size, and a specific mutation operator. In the computational tests carried out in this work, the algorithm achieved excellent results in detecting existing or even new communities in PPI networks.
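The three GA components named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact fitness formula, encoding, or mutation rules, so everything here is an assumption made for illustration: the fitness is taken to be a weighted mean of pairwise semantic similarity and interaction value, a community is encoded as a variable-size set of protein identifiers, and the mutation adds, removes, or swaps one member. The toy data and the names `fitness`, `mutate`, `crossover`, `evolve`, and `alpha` are hypothetical and are not taken from the paper.

```python
import random
from itertools import combinations

# Hypothetical toy data: six proteins in two functional groups.
PROTEINS = ["P1", "P2", "P3", "P4", "P5", "P6"]
GROUP = {"P1": 0, "P2": 0, "P3": 0, "P4": 1, "P5": 1, "P6": 1}

# Assumed pairwise scores in [0, 1]: high within a group, low across groups.
INTERACTION = {frozenset(p): (0.9 if GROUP[p[0]] == GROUP[p[1]] else 0.1)
               for p in combinations(PROTEINS, 2)}
SIMILARITY = {frozenset(p): (0.8 if GROUP[p[0]] == GROUP[p[1]] else 0.2)
              for p in combinations(PROTEINS, 2)}


def fitness(community, alpha=0.5):
    """Assumed fitness: mean over member pairs of
    alpha * similarity + (1 - alpha) * interaction."""
    pairs = list(combinations(sorted(community), 2))
    if not pairs:
        return 0.0
    total = sum(alpha * SIMILARITY.get(frozenset(p), 0.0)
                + (1 - alpha) * INTERACTION.get(frozenset(p), 0.0)
                for p in pairs)
    return total / len(pairs)


def mutate(community, proteins, rng):
    """Assumed mutation: add, remove, or swap one protein, so the
    community size can grow or shrink (never below two members)."""
    child = set(community)
    outside = [p for p in proteins if p not in child]
    op = rng.choice(["add", "remove", "swap"])
    if op == "add" and outside:
        child.add(rng.choice(outside))
    elif op == "remove" and len(child) > 2:
        child.remove(rng.choice(sorted(child)))
    elif op == "swap" and outside:
        child.remove(rng.choice(sorted(child)))
        child.add(rng.choice(outside))
    return frozenset(child)


def crossover(parent_a, parent_b, rng):
    """Assumed crossover: sample a child of random size from the union."""
    pool = sorted(parent_a | parent_b)
    return frozenset(rng.sample(pool, rng.randint(2, len(pool))))


def evolve(proteins, pop_size=20, generations=50, seed=0):
    """Elitist GA loop: keep the better half, refill with mutated offspring."""
    rng = random.Random(seed)
    population = [frozenset(rng.sample(proteins, rng.randint(2, len(proteins))))
                  for _ in range(pop_size)]
    for _ in range(generations):
        elite = sorted(population, key=fitness, reverse=True)[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            parent_a, parent_b = rng.sample(elite, 2)
            children.append(mutate(crossover(parent_a, parent_b, rng), proteins, rng))
        population = elite + children
    return max(population, key=fitness)


best = evolve(PROTEINS)
print(sorted(best), fitness(best))
```

Because this assumed fitness is a mean over pairs, it does not reward larger communities by itself; a real fitness function would need some way to balance cohesion against size, which this sketch leaves out.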

Notes

  1. The degree of a node is the number of edges incident to the node.

  2. https://pythonhosted.org/inspyred/.

Acknowledgements

We thank Dr. Walid BEDHIAFI (Laboratoire de Génétique Immunologie et Pathologies Humaines, Université de Tunis El Manar) for help in understanding the biological background and in interpreting the results.

Author information

Corresponding author

Correspondence to Marwa Ben M’barek.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Ben M’barek, M., Borgi, A., Ben Hmida, S., Rukoz, M. (2020). GA-PPI-Net: A Genetic Algorithm for Community Detection in Protein-Protein Interaction Networks. In: van Sinderen, M., Maciaszek, L. (eds) Software Technologies. ICSOFT 2019. Communications in Computer and Information Science, vol 1250. Springer, Cham. https://doi.org/10.1007/978-3-030-52991-8_7

  • DOI: https://doi.org/10.1007/978-3-030-52991-8_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-52990-1

  • Online ISBN: 978-3-030-52991-8

  • eBook Packages: Computer Science (R0)
