A Unified String Kernel for Biology Sequence

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5227)

Abstract

In this paper, we introduce a unified string kernel. Based on this unified string kernel, we construct an improved sparse kernel and a composite kernel. We evaluate these new kernels using the same target families and the same training and test set splits as the protein classification experiments of Weston et al. The results show that our kernels outperform previously developed string kernels.
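For readers unfamiliar with the terminology, the following is a minimal sketch of a string kernel and of one common way to build a composite kernel. It is not the unified kernel of this paper, whose definition is not given in the abstract. It shows the k-spectrum kernel of Leslie et al. (reference 1), which counts shared contiguous k-mers, and combines two normalized spectrum kernels by a weighted sum. The function names, the choice of k values, and the equal weights are all illustrative assumptions.

```python
from collections import Counter
from math import sqrt


def spectrum_features(seq: str, k: int) -> Counter:
    """Count every contiguous k-mer occurring in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(x: str, y: str, k: int) -> int:
    """k-spectrum kernel: inner product of the two k-mer count vectors."""
    fx, fy = spectrum_features(x, k), spectrum_features(y, k)
    # Counter returns 0 for a missing k-mer, so only shared k-mers contribute.
    return sum(count * fy[kmer] for kmer, count in fx.items())


def normalized_spectrum_kernel(x: str, y: str, k: int) -> float:
    """Cosine-normalized kernel, so that K(x, x) = 1 for any non-trivial x."""
    kxx = spectrum_kernel(x, x, k)
    kyy = spectrum_kernel(y, y, k)
    if kxx == 0 or kyy == 0:  # sequence shorter than k
        return 0.0
    return spectrum_kernel(x, y, k) / sqrt(kxx * kyy)


def composite_kernel(x: str, y: str, ks=(2, 3), weights=(0.5, 0.5)) -> float:
    """Hypothetical composite: a non-negative weighted sum of base kernels.

    A non-negative combination of valid kernels is itself a valid kernel;
    the ks and weights used here are assumptions chosen for illustration.
    """
    return sum(w * normalized_spectrum_kernel(x, y, k)
               for k, w in zip(ks, weights))


if __name__ == "__main__":
    print(spectrum_kernel("ABAB", "BABA", 2))
    print(composite_kernel("ABAB", "BABA"))
```

In practice such a kernel function would be used to fill a Gram matrix over the training sequences and passed to an SVM as a precomputed kernel.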

References

  1. Leslie, C., Eskin, E., Noble, W.S.: The Spectrum Kernel: A String Kernel for SVM Protein Classification. In: Proceedings of the Pacific Symposium on Biocomputing (PSB), Kaua’i, Hawaii (2002)

  2. Leslie, C., Eskin, E., Noble, W.S.: Mismatch String Kernels for SVM Protein Classification. Adv. Neural Inf. Process. Syst. 20, 467–476 (2003)

  3. Leslie, C., Eskin, E., Cohen, A., Weston, J., Noble, W.S.: Mismatch String Kernels for Discriminative Protein Classification. Bioinformatics 20, 467–476 (2004)

  4. Jaakkola, T., Diekhans, M., Haussler, D.: A Discriminative Framework for Detecting Remote Protein Homologies. J. Comput. Biol. 7, 95–114 (2000)

  5. Kuang, R., Ie, E., Wang, K., Wang, K., Siddiqi, M., Freund, Y., Leslie, C.: Profile-based String Kernels for Remote Homology Detection and Motif extraction. J. Bioinform. Comput. Biol. 3, 527–550 (2005)

  6. Vishwanathan, S., Smola, A.: Fast Kernels for String and Tree Matching. Adv. Neural Inf. Process. Syst. (2002)

  7. Lodhi, H., Saunders, C., Shawe-Taylor, J., Cristianini, N., Watkins, C.: Text Classification Using String Kernels. Journal of Machine Learning Research 2, 419–444 (2002)

  8. Rangwala, H., Karypis, G.: Profile-based Direct Kernels for Remote Homology Detection and Fold Recognition. Bioinformatics 21, 4239–4247 (2005)

  9. Leslie, C., Kuang, R.: Fast String Kernels Using Inexact Matching for Protein Sequences. Journal of Machine Learning Research 5, 1435–1455 (2004)

  10. Vinokourov, A., Soklakov, A.N., Saunders, C.: A Probabilistic Framework for Mismatch and Profile String Kernels. In: Proceedings of the 13th European Symposium on Artificial Neural Networks, pp. 325–330 (2005)

  11. Lingner, T., Meinicke, P.: Remote Homology Detection Based on Oligomer Distances. Bioinformatics 22, 2224–2231 (2006)

  12. Saigo, H., Vert, J.P., Ueda, N., Akutsu, T.: Protein Homology Detection Using String Alignment Kernels. Bioinformatics 20, 1682–1689 (2004)

  13. Eskin, E., Snir, S.: The Homology Kernel: A Biologically Motivated Sequence Embedding into Euclidean Space. In: Proceedings of the 2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, pp. 179–186 (2005)

  14. Ben-Hur, A., Noble, W.S.: Kernel Methods for Predicting Protein–Protein Interactions. Bioinformatics 21, i38–i46 (2005)

  15. Mak, B., Kwok, J.T., Ho, S.: A Study of Various Composite Kernels for Kernel Eigenvoice Speaker Adaptation. In: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Montreal, Canada, vol. 1, pp. 325–328 (May 2004)

  16. Diego, I.M., Moguerza, J.M., Munoz, A.: Combining Kernel Information for Support Vector Classification. In: Roli, F., Kittler, J., Windeatt, T. (eds.) MCS 2004. LNCS, vol. 3077, pp. 102–111. Springer, Heidelberg (2004)

  17. Hanley, J.A., McNeil, B.J.: The Meaning and Use of the Area under a Receiver Operating Characteristic (ROC) curve. Radiology 143, 29–36 (1982)

  18. Weston, J., Leslie, C., Ie, E., Zhou, D., Elisseeff, A., Noble, W.S.: Semi-supervised Protein Classification Using Cluster Kernels. Bioinformatics 21, 3241–3247 (2005)

Editor information

De-Shuang Huang, Donald C. Wunsch II, Daniel S. Levine, Kang-Hyun Jo

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yuan, D., Yang, S., Lai, G. (2008). A Unified String Kernel for Biology Sequence. In: Huang, DS., Wunsch, D.C., Levine, D.S., Jo, KH. (eds) Advanced Intelligent Computing Theories and Applications. With Aspects of Artificial Intelligence. ICIC 2008. Lecture Notes in Computer Science, vol 5227. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85984-0_76

  • DOI: https://doi.org/10.1007/978-3-540-85984-0_76

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-85983-3

  • Online ISBN: 978-3-540-85984-0

  • eBook Packages: Computer Science, Computer Science (R0)
