Abstract
The computational study of protein-protein interactions and protein structure is critical to understanding protein function. Identifying hot spot residues plays an important role in bioinformatics for revealing life activities. However, conventional hot spot prediction methods face considerable challenges. This paper proposes a hot spot prediction method that uses the SVM-RFE feature selection method to improve training performance. SMOTE-based oversampling is first used to add new samples so that the classifier does not overfit. SVM-RFE is then invoked to obtain an optimal feature subset. Finally, an SVM is trained on the selected features to predict the hot spots. Experimental results indicate that hot spot prediction performance is significantly improved compared with previous methods.
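The three stages named in the abstract (oversampling, recursive feature elimination, final SVM) can be illustrated with a small dependency-free sketch. Everything here is an assumption made for illustration: the toy Gaussian data, the function names `smote`, `train_linear_svm`, `svm_rfe` and `predict`, the linear kernel, and all parameter values. The abstract does not state the paper's actual features, kernel, or settings, so this should not be read as the authors' implementation.

```python
import random

def smote(minority, n_new, k=3, rng=None):
    """Create n_new synthetic samples, each interpolated between a minority
    sample and one of its k nearest minority neighbours."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        others = [m for m in minority if m is not x]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(x, m)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from x to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(x, neighbour)])
    return synthetic

def train_linear_svm(X, y, lr=0.01, lam=0.01, epochs=100, seed=0):
    """Stochastic subgradient descent on the L2-regularised hinge loss.
    Labels must be +1 / -1. Returns (weights, bias)."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            w = [wj * (1 - lr * lam) for wj in w]  # regularisation shrinkage
            if margin < 1:  # sample violates the margin: hinge term is active
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b

def svm_rfe(X, y, n_keep):
    """Retrain a linear SVM repeatedly, each time dropping the remaining
    feature with the smallest squared weight. Returns kept column indices."""
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        X_active = [[row[j] for j in active] for row in X]
        w, _ = train_linear_svm(X_active, y)
        worst = min(range(len(active)), key=lambda j: w[j] ** 2)
        del active[worst]
    return active

def predict(w, b, X):
    return [1 if sum(wj * xj for wj, xj in zip(w, row)) + b >= 0 else -1
            for row in X]

# Hypothetical toy data: 5 features, only the first two carry class signal.
rng = random.Random(42)

def sample(label):
    shift = 2.0 if label == 1 else -2.0
    signal = [rng.gauss(shift, 1.0), rng.gauss(shift, 1.0)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(3)]
    return signal + noise

neg = [sample(-1) for _ in range(30)]   # majority class (non-hot spots)
pos = [sample(1) for _ in range(8)]     # minority class (hot spots)

# 1) Balance the classes with synthetic minority samples.
synthetic = smote(pos, len(neg) - len(pos), rng=rng)
pos_bal = pos + synthetic
X = neg + pos_bal
y = [-1] * len(neg) + [1] * len(pos_bal)

# 2) Select a feature subset with SVM-RFE.
selected = svm_rfe(X, y, n_keep=2)

# 3) Train the final SVM on the selected features only.
X_sel = [[row[j] for j in selected] for row in X]
w, b = train_linear_svm(X_sel, y)
pred = predict(w, b, X_sel)
print("selected feature indices:", selected)
```

In a real evaluation the oversampling would be applied only to the training folds, so that synthetic samples never leak into the test data; the sketch omits the train/test split for brevity.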
Acknowledgment
The authors thank the members of the Machine Learning and Artificial Intelligence Laboratory, School of Computer Science and Technology, Wuhan University of Science and Technology, for their helpful discussions in seminars. This work was supported in part by the National Natural Science Foundation of China (No. 61502356, 61273225) and by the Hubei Province Natural Science Foundation of China (No. 2018CFB526).
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Lin, X., Zhang, X., Zhou, F. (2018). Identification of Hotspots in Protein-Protein Interactions Based on Recursive Feature Elimination. In: Huang, DS., Bevilacqua, V., Premaratne, P., Gupta, P. (eds) Intelligent Computing Theories and Application. ICIC 2018. Lecture Notes in Computer Science, vol 10954. Springer, Cham. https://doi.org/10.1007/978-3-319-95930-6_56
DOI: https://doi.org/10.1007/978-3-319-95930-6_56
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-95929-0
Online ISBN: 978-3-319-95930-6
eBook Packages: Computer Science, Computer Science (R0)