Predicting Hot Spots Using a Deep Neural Network Approach

Artificial Neural Networks

Abstract

Targeting protein–protein interactions is a challenging and crucial task in the drug discovery process. A good starting point for rational drug design is the identification of hot spots (HS) at protein–protein interfaces, typically conserved residues that contribute most significantly to binding. In this chapter, we describe step by step an in-house pipeline for HS prediction that uses only sequence-based features from the well-known SpotOn dataset of soluble proteins (Moreira et al., Sci Rep 7:8007, 2017) and a deep neural network. The pipeline is divided into three steps: (1) feature extraction, (2) deep learning classification, and (3) model evaluation. We present all the available resources, including code snippets, the main dataset, and the free and open-source modules/packages necessary for full replication of the protocol. Users should be able to develop an HS prediction model with accuracy, precision, recall, and AUROC of 0.96, 0.93, 0.91, and 0.86, respectively.


Abbreviations

FN: False negatives

FP: False positives

TN: True negatives

TP: True positives
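The evaluation metrics named in the abstract (accuracy, precision, recall, and AUROC) can be expressed in terms of these four counts. The block below is a minimal sketch in plain Python, not the chapter's own code: the function names, the 0/1 label convention (1 = hot spot, 0 = non-hot spot), and the toy labels and scores are all assumptions made for illustration. AUROC is computed with the pairwise-ranking definition rather than with a library routine.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count TP, FP, TN, FN for binary labels (assumed: 1 = hot spot, 0 = non-hot spot)."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["TP"] += 1
        elif truth == 0 and pred == 1:
            counts["FP"] += 1
        elif truth == 0 and pred == 0:
            counts["TN"] += 1
        else:  # truth == 1 and pred == 0
            counts["FN"] += 1
    return counts


def classification_metrics(counts: Dict[str, int]) -> Dict[str, float]:
    """Accuracy, precision, and recall from confusion counts (0.0 when undefined)."""
    tp, fp, tn, fn = counts["TP"], counts["FP"], counts["TN"], counts["FN"]
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not taken from the SpotOn dataset).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
scores = [0.9, 0.2, 0.4, 0.8, 0.1, 0.6, 0.7, 0.3]

counts = confusion_counts(y_true, y_pred)
metrics = classification_metrics(counts)
area = auroc(y_true, scores)
print(counts, metrics, area)
```

Accuracy, precision, and recall depend on thresholded predictions, whereas AUROC is computed from the continuous scores, which is one reason the four values reported for a model can differ noticeably.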

References

  1. Kotlyar M, Pastrello C, Malik Z et al (2019) IID 2018 update: context-specific physical protein–protein interactions in human, model organisms and domesticated species. Nucleic Acids Res 47:D581–D589

  2. Lage K (2014) Protein–protein interactions and genetic diseases: the interactome. Biochim Biophys Acta Mol Basis Dis 1842:1971–1980

  3. Ran X, Gestwicki JE (2018) Inhibitors of protein–protein interactions (PPIs): an analysis of scaffold choices and buried surface area. Curr Opin Chem Biol 44:75–86

  4. Fry DC (2015) Targeting protein-protein interactions for drug discovery. Protein-protein interactions. Methods Mol Biol 1278:93–106

  5. Moreira IS, Koukos PI, Melo R et al (2017) SpotOn: high accuracy identification of protein-protein interface hot-spots. Sci Rep 7:8007

  6. Moreira IS, Fernandes PA, Ramos MJ (2007) Hot spots-a review of the protein-protein interface determinant amino-acid residues. Proteins 68:803–812

  7. Melo R, Fieldhouse R, Melo A et al (2016) A machine learning approach for hot-spot detection at protein-protein interfaces. Int J Mol Sci 17:1215

  8. Sommer C, Gerlich DW (2013) Machine learning in cell biology – teaching computers to recognize phenotypes. J Cell Sci 126:5529–5539

  9. Libbrecht MW, Noble WS (2015) Machine learning applications in genetics and genomics. Nat Rev Genet 16:321–332

  10. Lise S, Buchan D, Pontil M et al (2011) Predictions of hot spot residues at protein-protein interfaces using support vector machines. PLoS One 6:e16774

  11. Ofran Y, Rost B (2007) ISIS: interaction sites identified from sequence. Bioinformatics 23:e13–e16

  12. Wang H, Liu C, Deng L (2018) Enhanced prediction of hot spots at protein-protein interfaces using extreme gradient boosting. Sci Rep 8:14285

  13. Jain AK, Jianchang M, Mohiuddin KM (1996) Artificial neural networks: a tutorial. Computer (Long Beach Calif) 29:31–44

  14. Gonzalez RC (2018) Deep convolutional neural networks [lecture notes]. IEEE Signal Process Mag 35:79–87

  15. Bengio Y (2009) Learning deep architectures for AI. Found Trends Mach Learn 2:1–127

  16. LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521:436–444

  17. Cock PJA, Antao T, Chang JT et al (2009) Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics 25:1422–1423

  18. van der Walt S, Colbert SC, Varoquaux G (2011) The NumPy array: a structure for efficient numerical computation. Comput Sci Eng 13:22–30

  19. McKinney W (2010) Data structures for statistical computing in Python. In: Proceedings of the 9th Python in Science Conference (SciPy 2010), Austin, Texas

  20. Rossum G van, Boer J de (1991) Linking a stub generator (AIL) to a prototyping language (Python). In: EurOpen Conference Proceedings, Tromso, Norway

  21. Pedregosa F, Varoquaux G, Gramfort A et al (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12:2825–2830

  22. Abadi M, Agarwal A, Barham P et al (2015) TensorFlow: large-scale machine learning on heterogeneous distributed systems. Preprint available at arXiv:1603.04467

  23. Buckman J, Roy A, Raffel C et al (2018) Thermometer encoding: one hot way to resist adversarial examples. In: 6th International Conference on Learning Representations (ICLR 2018), Vancouver, Canada

  24. Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. Preprint available at arXiv:1412.6980

  25. Crowther PS, Cox RJ (2005) A method for optimal division of data sets for use in neural networks. In: Knowledge-based intelligent information and engineering systems, KES 2005. Lecture notes in computer science, vol 3684. Springer, Berlin, Heidelberg

Acknowledgments

This work was supported by the European Regional Development Fund (ERDF), through the Centro 2020 Regional Operational Programme under project CENTRO-01-0145-FEDER-000008: BrainHealth 2020, and through the COMPETE 2020 Operational Programme for Competitiveness and Internationalisation and Portuguese national funds via FCT (Fundação para a Ciência e a Tecnologia), under projects POCI-01-0145-FEDER-031356, PTDC/QUI-OUT/32243/2017, and UIDB/04539/2020. A. J. Preto was also supported by FCT through PhD scholarship SFRH/BD/144966/2019. I. S. Moreira was funded by the FCT Investigator Programme, IF/00578/2014 (co-financed by European Social Fund and Programa Operacional Potencial Humano). The authors would also like to acknowledge ERNEST (European Research Network on Signal Transduction, CA18133) and STRATAGEM (New diagnostic and therapeutic tools against multidrug-resistant tumors, CA17104).

Author information

Corresponding author

Correspondence to Irina S. Moreira.

Copyright information

© 2021 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Preto, A.J., Matos-Filipe, P., de Almeida, J.G., Mourão, J., Moreira, I.S. (2021). Predicting Hot Spots Using a Deep Neural Network Approach. In: Cartwright, H. (eds) Artificial Neural Networks. Methods in Molecular Biology, vol 2190. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-0826-5_13

  • DOI: https://doi.org/10.1007/978-1-0716-0826-5_13

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-0825-8

  • Online ISBN: 978-1-0716-0826-5

  • eBook Packages: Springer Protocols
