A random walk-based method for detecting essential proteins by integrating the topological and biological features of PPI network

  • Methodologies and Application
Soft Computing

Abstract

Detecting essential proteins in protein–protein interaction (PPI) networks not only advances life science research but also has important applications in disease diagnosis and drug target identification. Many computational methods for detecting essential proteins have been proposed recently. Most of them rank proteins by the centrality measures of the corresponding nodes in the PPI network. Such centrality-based methods consider only the topological information of the PPI network and neglect the biological features of the proteins, which are crucial for recognizing essential proteins. This paper presents a random walk-based method named EPD-RW that identifies essential proteins by integrating network topology with biological information extracted from GO (gene ontology) data, gene expression profiles, domain information and phylogenetic profiles. EPD-RW uses both the topological structure of the PPI network and the biological information of the proteins to guide the random walk that computes their essentiality. An iterative method is presented to efficiently integrate the topological and biological features at each step of the random walk. We test EPD-RW in experiments on yeast PPI datasets and compare its results with those of other methods. The experimental results show that EPD-RW achieves the best performance among all the methods tested. Biological analysis of the results shows that the random walk-based method effectively increases the accuracy of essential protein detection, and that the biological features of the proteins greatly enhance detection performance.
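The abstract does not give the update rule of EPD-RW, so the following is only a generic illustration of the idea it describes: a random walk on a PPI graph whose restart step is biased by a per-protein biological score. It is a standard random walk with restart, not the authors' algorithm. The function name `biased_random_walk`, the parameter `alpha`, the toy network and the toy scores are all hypothetical, and the single `bio_score` value per protein stands in for whatever combination of GO, expression, domain and phylogenetic evidence a real method would use.

```python
def biased_random_walk(adjacency, bio_score, alpha=0.85, tol=1e-10, max_iter=1000):
    """Rank nodes with a random walk with restart (illustrative sketch only).

    adjacency: dict mapping each node to the set of its neighbours
               (undirected PPI edges).
    bio_score: dict mapping each node to a non-negative biological score;
               the scores are assumed to have a positive sum.
    alpha:     probability of following an edge; with probability 1 - alpha
               the walk restarts at a node drawn in proportion to bio_score.
    """
    nodes = sorted(adjacency)
    total_bio = sum(bio_score[n] for n in nodes)
    restart = {n: bio_score[n] / total_bio for n in nodes}
    rank = dict(restart)  # start from the biological prior

    for _ in range(max_iter):
        # Restart term: mass injected according to the biological scores.
        new_rank = {n: (1 - alpha) * restart[n] for n in nodes}
        for u in nodes:
            neighbours = adjacency[u]
            if not neighbours:
                # Isolated node: hand its mass back to the restart distribution.
                for n in nodes:
                    new_rank[n] += alpha * rank[u] * restart[n]
                continue
            # Topology term: spread the node's mass evenly over its neighbours.
            share = alpha * rank[u] / len(neighbours)
            for v in neighbours:
                new_rank[v] += share
        delta = sum(abs(new_rank[n] - rank[n]) for n in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical four-protein network and biological scores.
toy_ppi = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
toy_bio = {"A": 0.9, "B": 0.5, "C": 0.5, "D": 0.1}

scores = biased_random_walk(toy_ppi, toy_bio)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this sketch the topology enters through the edge-following step and the biological evidence through the restart distribution, so a protein scores highly only if it is both well connected and well supported biologically. Setting `alpha` closer to 1 weights the network structure more heavily; setting it closer to 0 weights the biological scores more heavily.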

Acknowledgements

This research was supported in part by the Chinese National Natural Science Foundation under grant Nos. 61379066, 61702441, 61070047, 61379064, 61472344, 61402395, 61906100 and 61602202; the Natural Science Foundation of Jiangsu Province under contracts BK20180822, BK20130452, BK2012672, BK2012128 and BK20140492; and the Natural Science Foundation of the Education Department of Jiangsu Province under contracts 18KJB520040, 12KJB520019 and 13KJB520026.

Author information

Corresponding author

Correspondence to Ling Chen.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical standards

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards.

Informed consent

This article does not contain any studies with animals performed by any of the authors. Informed consent was obtained from all individual participants included in the study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ahmed, N.M., Chen, L., Li, B. et al. A random walk-based method for detecting essential proteins by integrating the topological and biological features of PPI network. Soft Comput 25, 8883–8903 (2021). https://doi.org/10.1007/s00500-021-05780-8

  • DOI: https://doi.org/10.1007/s00500-021-05780-8
