
A Class of New Kernels Based on High-Scored Pairs of k-Peptides for SVMs and Its Application for Prediction of Protein Subcellular Localization

  • Conference paper
Transactions on Computational Systems Biology II

Part of the book series: Lecture Notes in Computer Science (TCSB, volume 3680)

Abstract

A class of new kernels has been developed for vectors derived from a coding scheme of the k-peptide composition for protein sequences. Each kernel defines the biological similarity for two mapped k-peptide coding vectors. The mapping transforms a k-peptide coding vector into a new vector based on a matrix formed by high BLOSUM scores associated with pairs of k-peptides. In conjunction with the use of support vector machines, the effectiveness of the new kernels is evaluated against the conventional coding scheme of k-peptide (k ≤ 3) for the prediction of subcellular localizations of proteins in Gram-negative bacteria. It is demonstrated that the new method outperforms all the other methods in a 5-fold cross-validation.
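The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definitions, so everything concrete here is an assumption: the three-letter alphabet, the substitution scores (hypothetical values, not real BLOSUM entries), the threshold, the rule that a pair of k-peptides is scored by summing position-wise scores, and the choice of a plain dot product on the mapped vectors. The function names are likewise made up for illustration.

```python
from itertools import product

# Toy alphabet and substitution scores: hypothetical values for illustration,
# NOT real BLOSUM entries. A real setup would use all 20 amino acids.
ALPHABET = "ADE"
TOY_SCORES = {
    ("A", "A"): 4, ("D", "D"): 6, ("E", "E"): 5,
    ("A", "D"): -2, ("A", "E"): -1, ("D", "E"): 2,
}


def residue_score(a, b):
    """Symmetric lookup in the toy substitution table."""
    return TOY_SCORES.get((a, b), TOY_SCORES.get((b, a)))


def kpeptides(k):
    """All k-peptides over the toy alphabet, in a fixed order."""
    return ["".join(p) for p in product(ALPHABET, repeat=k)]


def composition(seq, k):
    """k-peptide coding vector: relative frequency of each k-peptide in seq.

    Assumes len(seq) >= k and that seq only uses letters from ALPHABET.
    """
    peptides = kpeptides(k)
    index = {p: i for i, p in enumerate(peptides)}
    counts = [0.0] * len(peptides)
    windows = len(seq) - k + 1
    for i in range(windows):
        counts[index[seq[i:i + k]]] += 1
    return [c / windows for c in counts]


def high_score_matrix(k, threshold):
    """Matrix over pairs of k-peptides, keeping only scores >= threshold.

    Assumption: a pair's score is the sum of position-wise residue scores.
    """
    peptides = kpeptides(k)
    matrix = []
    for p in peptides:
        row = []
        for q in peptides:
            s = sum(residue_score(a, b) for a, b in zip(p, q))
            row.append(s if s >= threshold else 0)
        matrix.append(row)
    return matrix


def map_vector(matrix, x):
    """Map a coding vector through the high-score matrix (matrix times x)."""
    return [sum(m * v for m, v in zip(row, x)) for row in matrix]


def kernel(matrix, x, y):
    """Dot product of the two mapped vectors (assumed kernel form)."""
    mx, my = map_vector(matrix, x), map_vector(matrix, y)
    return sum(a * b for a, b in zip(mx, my))


if __name__ == "__main__":
    M = high_score_matrix(k=2, threshold=8)
    x = composition("ADEAD", 2)
    y = composition("DDEEA", 2)
    print(kernel(M, x, y))
```

Because the kernel in this sketch is an inner product of linearly mapped vectors, it is symmetric and positive semidefinite by construction, so a Gram matrix built from it could be passed to an SVM implementation that accepts precomputed kernels.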

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lei, Z., Dai, Y. (2005). A Class of New Kernels Based on High-Scored Pairs of k-Peptides for SVMs and Its Application for Prediction of Protein Subcellular Localization. In: Priami, C., Zelikovsky, A. (eds) Transactions on Computational Systems Biology II. Lecture Notes in Computer Science, vol 3680. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11567752_3

  • DOI: https://doi.org/10.1007/11567752_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29401-6

  • Online ISBN: 978-3-540-31661-9

  • eBook Packages: Computer Science, Computer Science (R0)
