In silico prediction methods of self-interacting proteins: an empirical and academic survey

  • Review Article
  • Published:
Frontiers of Computer Science

Abstract

In silico prediction of self-interacting proteins (SIPs) has become an important part of proteomics. Effective and reliable prediction methods are urgently needed to overcome the high cost and labor intensity of traditional wet-lab experiments. This survey gives a comprehensive overview of the recent literature on computational SIP prediction, as a reference for future work. We first describe the data required for the SIP prediction task. We then present feature extraction methods and computational models that have been applied to it. Next, an empirical comparison demonstrates the prediction performance of several classifiers under different feature extraction and encoding schemes. Finally, we conclude by highlighting potential methods for further improving SIP prediction performance, along with related research directions.


  151. Donoho D L. Compressed sensing. IEEE Transactions on Information Theory, 2006, 52(4): 1289–1306

    Article  MathSciNet  Google Scholar 

  152. Bingham E, Mannila H. Random projection in dimensionality reduction: applications to image and text data. In: Proceedings of the seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2001, 245–250

  153. Zhang J, Zhu M, Chen P, Wang B. DrugRPE: random projection ensemble approach to drug-target interaction prediction. Neurocomputing, 2017, 228: 256–262

    Article  Google Scholar 

  154. Jiang J, Wang N, Chen P, Zheng C, Wang B. Prediction of protein hotspots from whole protein sequences by a random projection ensemble system. International Journal of Molecular Sciences, 2017, 18(7): 1543

    Article  Google Scholar 

  155. Ge H, Sun L, Yao Y, Yu J. An automatic motif recognition algorithm in DNA sequences based on particle swarm optimization and random projection. In: Proceedings of 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). 2017, 2241–2243

  156. Dahl G E, Stokes J W, Deng L, Yu D. Large-scale malware classification using random projections and neural networks. In: Proceedings of 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. 2013, 3422–3426

  157. Hinton G, Deng L, Yu D, Dahl G, Mohamed A R, Jaitly N, Senior A, Vanhoucke V, Nguyen P, Kingsbury B. Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups. IEEE Signal Processing Magazine, 2012, 29(6): 82–97

    Article  Google Scholar 

  158. Dahl G E, Yu D, Deng L, Acero A. Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition. IEEE Transactions on Audio, Speech, and Language Processing, 2012, 20(1): 30–42

    Article  Google Scholar 

  159. Ciregan D, Meier U, Schmidhuber J. Multi-column deep neural networks for image classification. In: Proceedings of 2012 IEEE Conference on Computer Vision and Pattern Recognition. 2012, 3642–3649

  160. Szegedy C, Toshev A, Erhan D. Deep neural networks for object detection. In: Proceedings of the 26th International Conference on Neural Information Processing Systems. 2013, 2553–2561

  161. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature, 2015, 521(7553): 436–444

    Article  Google Scholar 

  162. Goodfellow I, Bengio Y, Courville A. Deep Learning. Cambridge: MIT Press, 2016

    Google Scholar 

  163. Zhou Z H, Feng J. Deep forest. National Science Review, 2019, 6(1): 74–86

    Article  MathSciNet  Google Scholar 

  164. Breiman L. Random forests. Machine Learning, 2001, 45(1): 5–32

    Article  Google Scholar 

  165. Chen T, Guestrin C. XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2016, 785–794

  166. Chen T, He T, Benesty M, Khotilovich V, Tang Y, Cho H, Chen K. Xgboost: extreme gradient boosting. R package version 0.4–2, 2015, 1(4): 1–4

    Google Scholar 

Download references

Acknowledgements

This work was supported by the National Key R&D Program of China (2020YFA0908700 and 2018AAA0100100), the National Natural Science Foundation of China (Grant Nos. 62002297, 61902342, U1713212, 61836005, and 62073225), the Natural Science Foundation of Guangdong Province-Outstanding Youth Program (2019B151502018), the Technology Research Project of Shenzhen City (JSGG20180507182904693), and the Public Technology Platform of Shenzhen City (GGFW2018021118145859).

Author information

Corresponding author

Correspondence to Zhuhong You.

Additional information

Zhanheng Chen received his PhD degree from the University of Chinese Academy of Sciences, China. His current research interests include data mining, natural language processing, bioinformatics, machine learning, and pattern identification. He has authored over 22 research publications in these areas, in journals including Briefings in Bioinformatics, Molecular Therapy-Nucleic Acids, iScience, Communications Biology, BMC Genomics, BMC Systems Biology, Frontiers in Genetics, International Journal of Molecular Sciences, and Journal of Cellular and Molecular Medicine, and at international conferences such as ICIBM and ICIC.

Zhuhong You received the BE degree in electronic information science and engineering from Hunan Normal University, China in 2005, and the PhD degree in control science and engineering from the University of Science and Technology of China (USTC), China in 2010. From June 2008 to November 2009, he was a Visiting Research Fellow with the Center of Biotechnology and Information, Cornell University, USA. He is currently a Professor with the School of Computer Science, Northwestern Polytechnical University, China. He has published more than 170 research articles in refereed journals and conferences in the areas of pattern recognition, bioinformatics, and complex-network analysis. He holds more than ten patents. His current research interests include neural networks, intelligent information processing, sparse representation, and their applications in bioinformatics.

Qinhu Zhang received his PhD degree in computer science and technology from Tongji University, China in 2019. He is now a postdoctoral researcher at Tongji University. His research interests include bioinformatics, machine learning, and deep learning.

Zhenhao Guo received his BE degree from the School of Information Science and Engineering (ISE), Shandong University, China in 2011, and his MSc degree from the University of Chinese Academy of Sciences, China in 2017. He is currently pursuing his PhD degree at Tongji University, China. His current research interests include text data mining, network analysis, and their applications in bioinformatics.

Siguo Wang received the MS degree from the School of Computer Science, Shaanxi Normal University, China in 2019. She is working toward the PhD degree in computer science and technology at Tongji University, China. Her research interests include bioinformatics, machine learning, and deep learning.

Yanbin Wang received his BE degree in Computer Science and Technology from Zhengzhou University, China in 2015. He obtained his MS degree in Computer Science from the University of Chinese Academy of Sciences (UCAS), China in 2018. He is currently pursuing his PhD degree at Zhejiang University, China. His research interests include deep neural networks, big data, signal processing, and their applications in bioinformatics.

About this article

Cite this article

Chen, Z., You, Z., Zhang, Q. et al. In silico prediction methods of self-interacting proteins: an empirical and academic survey. Front. Comput. Sci. 17, 173901 (2023). https://doi.org/10.1007/s11704-022-1563-1

  • DOI: https://doi.org/10.1007/s11704-022-1563-1

Keywords