Consensus of Sample-Balanced Classifiers for Identifying Ligand-Binding Residue by Co-evolutionary Physicochemical Characteristics of Amino Acids

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 375)

Abstract

Protein-ligand binding is an important mechanism by which some proteins perform their functions; the binding sites are the residues of a protein that physically bind to ligands. State-of-the-art methods search for known structures similar to the query and predict binding sites from those solved structures. However, such structural information is often unavailable. In this paper, we propose a sequence-based approach to identify protein-ligand binding residues. Because ligand-binding residues are heavily outnumbered by non-ligand-binding residues, we constructed several balanced data sets and trained a random forest (RF)-based classifier on each. The ensemble of these RF classifiers forms a sequence-based protein-ligand binding site predictor. Experimental results on CASP9 targets show that our method compares favorably with the state of the art.
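
The balancing-and-consensus idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say how the balanced sets were drawn or how the classifiers were combined, so the sketch assumes random undersampling of the non-binding class and an unweighted majority vote. The function names, the synthetic two-dimensional features, and the 0.5 threshold are all hypothetical, and a simple nearest-centroid learner stands in for the random forest to keep the example dependency-free.

```python
import random
from statistics import mean


def make_balanced_subsets(X, y, n_subsets, seed=0):
    """Build n_subsets training sets, each pairing ALL positive samples
    with an equally sized random draw of negatives (assumed scheme)."""
    rng = random.Random(seed)
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    subsets = []
    for _ in range(n_subsets):
        idx = pos + rng.sample(neg, len(pos))  # undersample the majority class
        subsets.append(([X[i] for i in idx], [y[i] for i in idx]))
    return subsets


class CentroidClassifier:
    """Stand-in base learner (NOT a random forest): predicts the class
    whose training centroid is nearest to the query point."""

    def fit(self, X, y):
        self.centroids = {}
        for label in (0, 1):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [mean(col) for col in zip(*rows)]
        return self

    def predict(self, x):
        def dist2(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids, key=lambda label: dist2(self.centroids[label]))


def consensus_predict(models, x, threshold=0.5):
    """Label x as binding (1) if at least `threshold` of the models vote 1."""
    votes = mean(m.predict(x) for m in models)
    return 1 if votes >= threshold else 0


# Synthetic, imbalanced toy data: 20 "binding" vs. 200 "non-binding" residues.
rng = random.Random(1)
X = [[rng.gauss(2.0, 0.5), rng.gauss(2.0, 0.5)] for _ in range(20)]
X += [[rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)] for _ in range(200)]
y = [1] * 20 + [0] * 200

subsets = make_balanced_subsets(X, y, n_subsets=5)
models = [CentroidClassifier().fit(Xs, ys) for Xs, ys in subsets]
print(consensus_predict(models, [2.0, 2.0]), consensus_predict(models, [0.0, 0.0]))
```

Each base learner sees every positive sample but a different slice of the negatives, so no single model is swamped by the majority class while the ensemble as a whole still draws on much of the negative data. Swapping in a real random forest would only require an object with the same `fit` and `predict` shape.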

This work was supported by Award Numbers KUS-CI-016-04 and GRP-CF-2011-19-P-Gao-Huang, made by King Abdullah University of Science and Technology (KAUST).

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chen, P. (2013). Consensus of Sample-Balanced Classifiers for Identifying Ligand-Binding Residue by Co-evolutionary Physicochemical Characteristics of Amino Acids. In: Huang, DS., Gupta, P., Wang, L., Gromiha, M. (eds) Emerging Intelligent Computing Technology and Applications. ICIC 2013. Communications in Computer and Information Science, vol 375. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39678-6_35

  • DOI: https://doi.org/10.1007/978-3-642-39678-6_35

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-39677-9

  • Online ISBN: 978-3-642-39678-6

  • eBook Packages: Computer Science, Computer Science (R0)
