Essential Protein Identification in Cancer: A Graph-Based Approach Integrating Topological and Biological Features in PPI Networks

  • Original Research
SN Computer Science

Abstract

Identifying essential proteins in protein–protein interaction (PPI) networks has important applications in cancer diagnosis and drug target identification. Despite significant advances in cancer research, identifying essential cancer proteins within PPI networks remains a major challenge. PPI networks are advantageous because they capture the interconnectedness of cancer proteins and make it possible to prioritize the proteins with the most significant impact on disease progression. This study proposes a graph-based approach for identifying essential cancer proteins within PPI networks associated with breast, lung, colorectal, and ovarian cancers. The analysis follows an organized sequence of analytical procedures, starting from cancer gene datasets for these four cancers obtained from the National Center for Biotechnology Information (NCBI). EPI-GBRWR, a novel method for essential protein identification using a graph-based random walk with restart, is introduced; it integrates topological and biological properties of PPI networks. Applying EPI-GBRWR sheds light on the hierarchical influence of proteins within the PPI network. The functional implications of the identified proteins are substantiated and contextualized through statistical assessments, including permutation and enrichment tests, and pathway analysis of these findings reveals interconnected molecular pathways in cancer. The computational results confirm the efficiency of EPI-GBRWR in predicting essential proteins: compared with other state-of-the-art methods for identifying essential proteins, EPI-GBRWR performs better on various evaluation criteria. This work underscores the value of integrative methodologies in deciphering the complexity of cancer and represents an advance for precision oncology.
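
The abstract does not spell out how EPI-GBRWR is computed, so the following is only a minimal sketch of a generic random walk with restart on an unweighted, undirected graph, not the authors' method. The toy network, the placeholder protein names (`P1` to `P5`), the seed set, and the restart probability of 0.3 are all assumptions chosen for illustration; the biological features that EPI-GBRWR integrates are not modeled here.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Generic random walk with restart (RWR) on an undirected graph.

    adjacency: dict mapping each node to a list of its neighbours.
    seeds:     set of nodes the walker jumps back to when it restarts.
    restart:   probability of restarting at each step (assumed value).
    Returns a dict of steady-state visiting probabilities per node.
    """
    if not seeds:
        raise ValueError("at least one seed node is required")
    nodes = sorted(adjacency)
    # Restart distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Every node first receives its share of the restart mass.
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            nbrs = adjacency[u]
            if not nbrs:
                # Isolated node: send its walking mass back to the seeds.
                for n in nodes:
                    nxt[n] += (1 - restart) * p[u] * p0[n]
                continue
            # Otherwise the walker moves to a uniformly chosen neighbour.
            share = (1 - restart) * p[u] / len(nbrs)
            for v in nbrs:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy PPI network; names and edges are placeholders.
toy_ppi = {
    "P1": ["P2", "P3"],
    "P2": ["P1", "P3"],
    "P3": ["P1", "P2", "P4"],
    "P4": ["P3", "P5"],
    "P5": ["P4"],
}

if __name__ == "__main__":
    scores = random_walk_with_restart(toy_ppi, seeds={"P1"})
    # Rank proteins by steady-state probability, highest first.
    for name in sorted(scores, key=scores.get, reverse=True):
        print(f"{name}\t{scores[name]:.4f}")
```

In this kind of scheme, proteins that are both close to the seeds and well connected accumulate more probability mass, which is one way a walk-based score can reflect the hierarchical influence of proteins in a network.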

Data availability

The dataset generated and analyzed during the current study is available from the corresponding author upon reasonable request.

Acknowledgements

We acknowledge the infrastructure and computational facilities provided by the DST-FIST Bioinformatics Lab of IIIT Bhubaneswar.

Author information

Corresponding author

Correspondence to Trilochan Rout.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author confirms that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Rout, T., Mohapatra, A., Kar, M. et al. Essential Protein Identification in Cancer: A Graph-Based Approach Integrating Topological and Biological Features in PPI Networks. SN COMPUT. SCI. 5, 947 (2024). https://doi.org/10.1007/s42979-024-03312-3

  • DOI: https://doi.org/10.1007/s42979-024-03312-3

Keywords