An effective machine learning-based model for the prediction of protein–protein interaction sites in health systems

Tahir, Muhammad; Khan, Fazlullah; Hayat, Maqsood; Alshehri, Mohammad Dahman

doi:10.1007/s00521-022-07024-8

S.I.: Improving Healthcare outcomes using Multimedia Big Data Analytics
Published: 20 February 2022

Neural Computing and Applications, Volume 36, pages 65–75 (2024)

Muhammad Tahir¹,
Fazlullah Khan¹,
Maqsood Hayat¹ &
Mohammad Dahman Alshehri²

Abstract

Proteins are vital biomolecules that carry out distinct biological activities by interacting with other proteins in complex biological systems. Characterizing protein–protein interaction (PPI) sites and hot spots is of primary importance in drug discovery and in understanding cellular signaling. Given the significance of PPIs, an intelligent prediction system based on fuzzy logic, “PPIs-FuzzyKNN”, is developed for identifying PPI sites. Protein sequences are transformed into equal-length numerical descriptors using the physicochemical properties of amino acids and a position-specific scoring matrix (PSSM). Conventional machine learning algorithms are applied alongside fuzzy k-nearest neighbors, and the results of the model are assessed via tenfold cross-validation. The proposed model, PPIs-FuzzyKNN, obtained accuracies of 91.20%, 92.65%, and 93.50% on three different datasets, namely Dtestset72, PDBtestset164, and Dset186, respectively. These results show that the proposed model performs strongly and consistently on all datasets compared with results reported in the literature so far. Consequently, it can not only play a leading role in the accurate identification of PPI sites but also serve as a basic tool for the research community.
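
To make the classification step in the abstract more concrete, the following Python sketch shows a generic fuzzy k-nearest neighbors classifier evaluated with tenfold cross-validation. It is only an illustration under stated assumptions, not the authors' implementation: the full article is not visible in this preview, so the neighbor count `k`, the fuzzifier `m`, the crisp (one-hot) training memberships, and the function names `fuzzy_knn_predict` and `tenfold_accuracy` are all hypothetical choices, and synthetic Gaussian features stand in for the PSSM and physicochemical descriptors.

```python
import numpy as np


def fuzzy_knn_predict(X_train, y_train, X_test, k=5, m=2.0, n_classes=None):
    """Generic fuzzy k-NN with crisp training memberships (an assumption).

    Returns the predicted labels and the class-membership matrix, whose
    rows sum to 1. Labels are expected to be integers 0..n_classes-1.
    """
    X_train = np.asarray(X_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    y_train = np.asarray(y_train, dtype=int)
    if n_classes is None:
        n_classes = int(y_train.max()) + 1

    # One-hot memberships of the training samples.
    U = np.eye(n_classes)[y_train]

    # Squared Euclidean distances, shape (n_test, n_train).
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)

    # Indices and squared distances of the k nearest training samples.
    nn = np.argsort(d2, axis=1)[:, :k]
    nn_d2 = np.take_along_axis(d2, nn, axis=1)

    # Inverse-distance weights 1 / d^(2/(m-1)); the floor avoids division by zero.
    w = 1.0 / np.maximum(nn_d2, 1e-12) ** (1.0 / (m - 1.0))

    # Weighted average of the neighbors' memberships for each test sample.
    memberships = (w[:, :, None] * U[nn]).sum(axis=1) / w.sum(axis=1, keepdims=True)
    return memberships.argmax(axis=1), memberships


def tenfold_accuracy(X, y, k=5, m=2.0, n_folds=10, seed=0):
    """Plain (unstratified) n-fold cross-validated accuracy."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=int)
    n_classes = int(y.max()) + 1
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)

    correct = 0
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        pred, _ = fuzzy_knn_predict(
            X[train_idx], y[train_idx], X[test_idx],
            k=k, m=m, n_classes=n_classes,
        )
        correct += int((pred == y[test_idx]).sum())
    return correct / len(y)


if __name__ == "__main__":
    # Synthetic two-class data standing in for real residue descriptors.
    rng = np.random.default_rng(42)
    X_demo = np.vstack([rng.normal(0.0, 1.0, (100, 8)),
                        rng.normal(3.0, 1.0, (100, 8))])
    y_demo = np.array([0] * 100 + [1] * 100)
    print("tenfold accuracy:", tenfold_accuracy(X_demo, y_demo))
```

Unlike ordinary k-NN, the classifier returns a graded membership for each class rather than a single vote count, which can be useful when interface and non-interface residues overlap in feature space. Accuracy on such toy data says nothing about the figures reported in the abstract.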

Acknowledgements

The study is supported by the Taif University Researchers Supporting Project number (TURSP-2020/126), Taif University, Taif, Saudi Arabia.

Author information

Authors and Affiliations

Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, 23200, KPK, Pakistan
Muhammad Tahir, Fazlullah Khan & Maqsood Hayat
Department of Computer Science, College of Computers and Information Technology, Taif University, P.O. Box 11099, Taif, 21944, Saudi Arabia
Mohammad Dahman Alshehri

Corresponding authors

Correspondence to Fazlullah Khan or Maqsood Hayat.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Tahir, M., Khan, F., Hayat, M. et al. An effective machine learning-based model for the prediction of protein–protein interaction sites in health systems. Neural Comput & Applic 36, 65–75 (2024). https://doi.org/10.1007/s00521-022-07024-8

Received: 25 June 2021
Accepted: 30 January 2022
Published: 20 February 2022
Issue Date: January 2024
DOI: https://doi.org/10.1007/s00521-022-07024-8
