Active learning strategies with COMBINE analysis: new tricks for an old dog

Journal of Computer-Aided Molecular Design

Abstract

The COMBINE method was designed to study congeneric series of compounds using structural information from ligand–protein complexes. Although very successful, the method has not received the same level of attention as other approaches to Quantitative Structure–Activity Relationships (QSAR), mainly because it lacks ways to measure the uncertainty of its predictions and because it needs large datasets. Active learning, a semi-supervised learning approach that uses uncertainty to improve model performance while reducing the size of the training set, has been used in this work to address both problems. We propose two estimators of uncertainty: the pool of regressors and the distance to the training set. The performance of the methods was evaluated by testing the resulting active learning workflows on three diverse datasets: HIV-1 protease inhibitors, taxol derivatives and BRD4 inhibitors. The proposed strategies were successful in 80% of the cases for the taxol derivatives and BRD4 inhibitors, and they outperformed random selection in the time-split of the HIV-1 protease inhibitors. Our results suggest that AL-COMBINE might be an effective way of producing consistently superior QSAR models with a limited number of samples.
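
The two uncertainty estimators named in the abstract can be illustrated with a minimal sketch. It assumes, since the abstract does not say, that the pool-of-regressors uncertainty is the standard deviation of the predictions across the pool, that the distance to the training set is the Euclidean distance to the nearest training compound in descriptor space, and that each active learning round selects the most uncertain candidates. All function names are hypothetical and the numbers are toy values, not data from the study.

```python
import math
import statistics


def pool_uncertainty(pool_predictions):
    """Spread of predictions across a pool of regressors, per compound.

    pool_predictions[m][i] is the prediction of model m for compound i.
    Assumption: disagreement is measured as the population standard deviation.
    """
    per_compound = zip(*pool_predictions)  # regroup predictions by compound
    return [statistics.pstdev(values) for values in per_compound]


def distance_to_training_set(candidates, training):
    """Euclidean distance from each candidate to its nearest training compound.

    Assumption: compounds are described by equal-length numeric vectors.
    """
    return [min(math.dist(c, t) for t in training) for c in candidates]


def select_most_uncertain(uncertainties, batch_size):
    """Indices of the batch_size candidates with the largest uncertainty."""
    ranked = sorted(range(len(uncertainties)),
                    key=lambda i: uncertainties[i], reverse=True)
    return ranked[:batch_size]


if __name__ == "__main__":
    # Toy predictions from three models for three candidate compounds.
    pool_predictions = [
        [6.1, 7.0, 5.2],
        [6.0, 8.2, 5.3],
        [6.2, 5.8, 5.1],
    ]
    spread = pool_uncertainty(pool_predictions)
    print("pool pick:", select_most_uncertain(spread, 1))

    # Toy two-dimensional descriptor vectors.
    training = [(0.0, 0.0), (1.0, 0.0)]
    candidates = [(0.0, 1.0), (4.0, 4.0)]
    distances = distance_to_training_set(candidates, training)
    print("distance pick:", select_most_uncertain(distances, 1))
```

In a full workflow, the selected compounds would be measured, added to the training set, and the model refitted before the next round. The sketch leaves out the regression models themselves.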

Abbreviations

AL: Active learning

PLS: Partial least squares

SVMR: Support vector machine regression

QSAR: Quantitative structure–activity relationships

COMBINE: COMparative binding energy analysis

cMMISMSA: Classic molecular mechanism implicit solvent model surface access

HIV: Human immunodeficiency virus

BRD4-BD1: Bromodomain-containing protein 4 N-terminal bromodomain

Acknowledgements

In memoriam Dr. Angel Ramirez Ortiz (1966–2008). We thank Prof. Dr. Federico Gago for providing the historical HIV-PR and taxanes data sets.

Author information

Corresponding author

Correspondence to Alvaro Cortes Cabrera.

Electronic supplementary material

Supplementary material 1 (DOCX 9804 KB)

About this article

Cite this article

Fusani, L., Cabrera, A.C. Active learning strategies with COMBINE analysis: new tricks for an old dog. J Comput Aided Mol Des 33, 287–294 (2019). https://doi.org/10.1007/s10822-018-0181-3

  • DOI: https://doi.org/10.1007/s10822-018-0181-3
