Abstract
Identifying essential proteins in protein–protein interaction (PPI) networks has important applications in cancer diagnosis and drug target identification. Despite significant advances in cancer research, identifying essential cancer proteins within PPI networks remains a major challenge. PPI networks are well suited to this task because they capture the interconnectedness of cancer proteins and allow prioritizing those with the greatest impact on disease progression. This study uses a graph-based approach to identify essential cancer proteins within PPI networks associated with breast, lung, colorectal, and ovarian cancers. The analysis follows an organized sequence of procedures, starting from cancer gene datasets for these four cancers obtained from the National Center for Biotechnology Information (NCBI). The study introduces EPI-GBRWR (essential protein identification using graph-based random walk with restart), a novel method that integrates topological and biological properties within PPI networks. Applying the random walk with restart reveals the hierarchical influence of proteins within the PPI network. The functional relevance of the identified proteins is substantiated and put in context through statistical assessments, including permutation and enrichment tests. Pathway analysis of these findings highlights interconnected molecular pathways in cancer. This work underscores the value of integrative methods in deciphering the complexity of cancer, with implications for cancer research and treatment. The computational results confirm EPI-GBRWR's efficiency in predicting essential proteins. Compared with other state-of-the-art methods for identifying essential proteins, EPI-GBRWR performs better across various evaluation criteria, marking a significant advancement in precision oncology.
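For readers unfamiliar with random walk with restart, the generic technique can be sketched as follows. This is a minimal illustration only, not the EPI-GBRWR method itself: the abstract does not give the method's transition weights, biological features, or parameters, so the function name, the restart probability of 0.3, the seed choice, and the five-node toy network are all assumptions made for the example.

```python
import numpy as np

def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Generic random walk with restart on an undirected graph.

    adj     : square adjacency matrix (hypothetical toy PPI network)
    seeds   : indices of seed nodes the walker restarts from (assumed)
    restart : probability of jumping back to a seed at each step (assumed)
    Returns the steady-state visiting probability of every node.
    """
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # avoid division by zero for isolated nodes
    transition = adj / col_sums            # column-normalized transition matrix

    p0 = np.zeros(adj.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)     # uniform restart distribution over seeds

    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1.0 - restart) * transition @ p + restart * p0
        if np.abs(p_next - p).sum() < tol:  # L1 convergence check
            return p_next
        p = p_next
    return p

# Toy network: a triangle (0, 1, 2) with a tail 2 - 3 - 4.
adj = np.array([
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
])
scores = random_walk_with_restart(adj, seeds=[0])
ranking = [int(i) for i in np.argsort(-scores)]  # nodes ordered by decreasing score
```

Nodes with higher steady-state scores are closer, in a network sense, to the seed set. A method that integrates biological properties would presumably alter the transition weights or the restart distribution, but how EPI-GBRWR does so is not stated in the abstract.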
Data availability
The dataset generated and analyzed during the current study is available from the corresponding author upon reasonable request.
Acknowledgements
We acknowledge the infrastructure and computational facilities provided by the DST-FIST Bioinformatics Lab of IIIT Bhubaneswar.
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author confirms that there is no conflict of interest.
About this article
Cite this article
Rout, T., Mohapatra, A., Kar, M. et al. Essential Protein Identification in Cancer: A Graph-Based Approach Integrating Topological and Biological Features in PPI Networks. SN COMPUT. SCI. 5, 947 (2024). https://doi.org/10.1007/s42979-024-03312-3
DOI: https://doi.org/10.1007/s42979-024-03312-3