Abstract
This systematic review surveys graph-based methodologies used to analyze protein–protein interaction (PPI) networks. Its primary objective is to synthesize the existing literature and identify key methodologies, resources, and best practices, with a focus on their application in uncovering essential cancer proteins. A systematic literature search was conducted across multiple databases to identify studies on graph-based exploration of PPI networks. The selected articles were critically reviewed, and data were extracted on the methodologies employed, the resources used, and the best practices identified. The review then outlines a workflow that runs from the compilation of gene/protein datasets to the identification of essential cancer proteins, and a case study on uncovering essential cancer proteins in breast cancer illustrates how the methodologies apply in practice. The methodologies identified include centrality measures, pathway enrichment analyses, and network visualization techniques. The review also identifies essential resources, such as databases, software tools, and repositories, along with best practices for data preprocessing, network construction, and analysis. Together with the case study, this synthesis gives researchers an overview of the current landscape of graph-based PPI network analysis and its application in cancer research, and is intended to support informed decision-making and to improve research quality and reproducibility.
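As a rough illustration of the centrality-based step in such a workflow, the sketch below ranks the proteins of a toy network by the mean of their degree and closeness centrality. It is a minimal sketch under stated assumptions, not the procedure used in the review: the edge list, the protein labels, the helper names (`build_graph`, `rank_candidates`), and the choice to average two measures are all hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical toy interactions; the protein labels are placeholders, not data
# from the review. Real edges would come from a PPI database export.
TOY_EDGES = [
    ("P1", "P2"), ("P1", "P3"), ("P1", "P4"),
    ("P4", "P5"), ("P5", "P6"), ("P5", "P7"),
]


def build_graph(edges):
    """Build an undirected adjacency map from (protein, protein) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            graph[a].add(b)
            graph[b].add(a)
    return graph


def degree_centrality(graph):
    """Fraction of the other nodes each node is directly connected to."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


def closeness_centrality(graph):
    """Reachable-node count divided by the summed BFS distances to them."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for nbr in graph[current]:
                if nbr not in dist:
                    dist[nbr] = dist[current] + 1
                    queue.append(nbr)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


def rank_candidates(edges, top_k=3):
    """Rank proteins by the mean of degree and closeness centrality."""
    graph = build_graph(edges)
    degree = degree_centrality(graph)
    closeness = closeness_centrality(graph)
    # Averaging two measures is an illustrative choice only.
    score = {node: (degree[node] + closeness[node]) / 2 for node in graph}
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(score, key=lambda node: (-score[node], node))[:top_k]


print(rank_candidates(TOY_EDGES))
```

In practice, a ranking like this would only be a first filter; the abstract notes that pathway enrichment analysis and visualization are also part of the analyses reviewed.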
Data availability
Not applicable.
Acknowledgements
We acknowledge the DST-FIST Bioinformatics Lab, IIIT Bhubaneswar for the computational facilities used in this work.
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author confirms that there is no conflict of interest.
About this article
Cite this article
Rout, T., Mohapatra, A. & Kar, M. A systematic review of graph-based explorations of PPI networks: methods, resources, and best practices. Netw Model Anal Health Inform Bioinforma 13, 29 (2024). https://doi.org/10.1007/s13721-024-00467-0