Abstract
Cysteines may form covalent bonds, known as disulfide bridges, that play an important role in stabilizing the native conformation of proteins. Several methods have been proposed for predicting the bonding state of cysteines, using either local context or global protein descriptors. In this paper we introduce an SVM-based predictor that operates in two stages. The first stage is a multi-class classifier that operates at the protein level, using either standard Gaussian kernels or spectrum kernels. The second stage is a binary classifier that refines the prediction by exploiting local context enriched with evolutionary information in the form of multiple alignment profiles. At both stages, we enrich the profile encoding with information about cysteine conservation. The prediction accuracy of the system is 85%, measured by 5-fold cross-validation on a set of 716 proteins from the September 2001 PDB Select dataset.
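To make the spectrum kernel mentioned for the first stage concrete, the following is a minimal sketch of a k-spectrum kernel in the standard sense: each sequence is mapped to the counts of its contiguous substrings of length k, and the kernel value is the inner product of the two count vectors. The function names `kmer_counts` and `spectrum_kernel` and the default `k = 3` are illustrative assumptions, not the implementation or settings used in the paper, and the sketch omits the suffix-tree optimizations that make the kernel efficient on long sequences.

```python
from collections import Counter


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every contiguous substring of length k in seq.

    Returns an empty Counter when seq is shorter than k.
    """
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a: str, b: str, k: int = 3) -> int:
    """Inner product of the k-mer count vectors of a and b.

    Hypothetical helper for illustration; k = 3 is an arbitrary default.
    """
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    # Iterate over the smaller spectrum; k-mers missing from the other
    # sequence contribute zero because Counter returns 0 for absent keys.
    if len(cb) < len(ca):
        ca, cb = cb, ca
    return sum(n * cb[kmer] for kmer, n in ca.items())


if __name__ == "__main__":
    # Toy strings, not real protein sequences.
    print(spectrum_kernel("ACCA", "CCAC", k=2))
```

A Gram matrix built from such pairwise values could in principle be handed to any SVM solver that accepts precomputed kernels; how the paper combines this protein-level stage with the local, profile-based second stage is not specified in the abstract and is not modelled here.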
Cite this article
Ceroni, A., Frasconi, P., Passerini, A. et al. Predicting the Disulfide Bonding State of Cysteines with Combinations of Kernel Machines. The Journal of VLSI Signal Processing-Systems for Signal, Image, and Video Technology 35, 287–295 (2003). https://doi.org/10.1023/B:VLSI.0000003026.58068.ce