Abstract

Cysteines may form covalent bonds, known as disulfide bridges, that play an important role in stabilizing the native conformation of proteins. Several methods have been proposed for predicting the bonding state of cysteines, using either local context or global protein descriptors. In this paper we introduce an SVM-based predictor that operates in two stages. The first stage is a multi-class classifier that operates at the protein level, using either standard Gaussian or spectrum kernels. The second stage is a binary classifier that refines the prediction by exploiting local context enriched with evolutionary information in the form of multiple alignment profiles. At both stages, we enriched the profile encoding with information about cysteine conservation. The prediction accuracy of the system is 85%, measured by 5-fold cross-validation on a set of 716 proteins from the September 2001 PDB Select dataset.
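The spectrum kernel mentioned for the first stage can be illustrated with a minimal sketch. It compares two sequences by the inner product of their k-mer count vectors. The function names, the choice k = 3, and the cosine-style normalization are assumptions made for illustration; the abstract does not state the parameters or the implementation the authors used.

```python
import math
from collections import Counter


def kmer_counts(seq, k):
    """Count every contiguous substring of length k in seq."""
    # range() is empty when seq is shorter than k, giving an empty Counter.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a, b, k=3):
    """Inner product of the k-mer count vectors of sequences a and b.

    k=3 is an illustrative default, not a value taken from the paper.
    """
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    # Iterate over the smaller table; missing k-mers count as zero.
    if len(ca) > len(cb):
        ca, cb = cb, ca
    return sum(n * cb[kmer] for kmer, n in ca.items())


def normalized_spectrum_kernel(a, b, k=3):
    """Spectrum kernel scaled so that a sequence has similarity 1 with itself.

    The normalization is an assumption added here so that long and short
    proteins are comparable; it is not described in the abstract.
    """
    kaa = spectrum_kernel(a, a, k)
    kbb = spectrum_kernel(b, b, k)
    if kaa == 0 or kbb == 0:
        return 0.0
    return spectrum_kernel(a, b, k) / math.sqrt(kaa * kbb)
```

A kernel matrix built from such values over all pairs of proteins could be passed to any SVM implementation that accepts precomputed kernels.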


Cite this article

Ceroni, A., Frasconi, P., Passerini, A. et al. Predicting the Disulfide Bonding State of Cysteines with Combinations of Kernel Machines. The Journal of VLSI Signal Processing-Systems for Signal, Image, and Video Technology 35, 287–295 (2003). https://doi.org/10.1023/B:VLSI.0000003026.58068.ce

  • DOI: https://doi.org/10.1023/B:VLSI.0000003026.58068.ce
