Hot Spots & Hot Regions Detection Using Classification Algorithms in BMPs Complexes at the Protein-Protein Interface with the Ground-State Energy Feature

Chaparro-Amaro, O.; Martínez-Felipe, M.; Martínez-Castro, J.

doi:10.1007/978-3-031-07750-0_1

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13264)

Included in the following conference series:

Mexican Conference on Pattern Recognition


Abstract

We present the results of applying several machine learning algorithms to predict hot-spot and hot-region residues at the protein-protein interface between the polypeptide chains of protein complexes. The dataset consisted of twenty-nine bone morphogenetic proteins (BMPs) obtained from the Protein Data Bank (PDB). The training features were selected from biochemical and biophysical properties of each amino acid at these interfaces, such as B-factor, hydrophobicity index, prevalence score, accessible surface area (ASA), conservation score, and ground-state energy computed with Density Functional Theory (DFT). We also applied parallel CPU/GPU hardware acceleration during preprocessing to shorten the execution time of the ASA and DFT calculations. We evaluated the performance of the classifiers with several metrics. The random forest classifier performed best, achieving true positive and true negative rates of about $90\%$ on average.
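As an illustration of the two rates reported in the abstract, the following minimal sketch computes the true positive rate and true negative rate from binary labels, where 1 marks a hot-spot residue and 0 a non-hot-spot residue. The function names and the toy labels are hypothetical and are not taken from the paper; the authors' actual evaluation code and data are not shown here.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Count (TP, TN, FP, FN) for binary labels, with 1 as the positive class."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:  # truth == 1 and pred == 0
            fn += 1
    return tp, tn, fp, fn


def tpr_tnr(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (true positive rate, true negative rate); 0.0 if a class is absent."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return tpr, tnr


# Hypothetical toy labels (NOT data from the paper): 3 hot spots, 7 non-hot spots.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]

tpr, tnr = tpr_tnr(y_true, y_pred)
print(f"TPR = {tpr:.3f}, TNR = {tnr:.3f}")
```

Reporting both rates, rather than plain accuracy, matters when hot spots are a minority of interface residues, because a classifier that labels every residue as a non-hot spot would still score a high accuracy.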


Acknowledgments

This study was supported by: “Programa de desarrollo tecnológico e innovación para alumnos del IPN. México 2021” and by CONACYT (Consejo Nacional de Ciencia y Tecnología).

Author information

Authors and Affiliations

Centro de Investigación en Computación, Instituto Politécnico Nacional, Av. Juan de Dios Bátiz s/n, Esq. Miguel Othón de Mendizábal, Col. Nueva Industrial Vallejo, Gustavo A. Madero, CDMX, C.P. 07738, Mexico
O. Chaparro-Amaro, M. Martínez-Felipe & J. Martínez-Castro

Authors

O. Chaparro-Amaro
M. Martínez-Felipe
J. Martínez-Castro

Corresponding author

Correspondence to O. Chaparro-Amaro.

Editor information

Editors and Affiliations

Universidad Autónoma de Ciudad Juárez, Ciudad Juárez, Mexico
Osslan Osiris Vergara-Villegas
Universidad Autónoma de Ciudad Juárez, Ciudad Juárez, Mexico
Vianey Guadalupe Cruz-Sánchez
Instituto Politécnico Nacional, Mexico City, Mexico
Juan Humberto Sossa-Azuela
Instituto Nacional de Astrofísica, Óptica y Electrónica, Puebla, Mexico
Jesús Ariel Carrasco-Ochoa
Instituto Nacional de Astrofísica, Óptica y Electrónica, Puebla, Mexico
José Francisco Martínez-Trinidad
Autonomous University of Puebla, Puebla, Mexico
José Arturo Olvera-López

About this paper

Cite this paper

Chaparro-Amaro, O., Martínez-Felipe, M., Martínez-Castro, J. (2022). Hot Spots & Hot Regions Detection Using Classification Algorithms in BMPs Complexes at the Protein-Protein Interface with the Ground-State Energy Feature. In: Vergara-Villegas, O.O., Cruz-Sánchez, V.G., Sossa-Azuela, J.H., Carrasco-Ochoa, J.A., Martínez-Trinidad, J.F., Olvera-López, J.A. (eds) Pattern Recognition. MCPR 2022. Lecture Notes in Computer Science, vol 13264. Springer, Cham. https://doi.org/10.1007/978-3-031-07750-0_1


DOI: https://doi.org/10.1007/978-3-031-07750-0_1
Published: 11 June 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-07749-4
Online ISBN: 978-3-031-07750-0
eBook Packages: Computer Science (R0)
