Peptidase Detection and Classification Using Enhanced Kernel Methods with Feature Selection

  • Conference paper

Part of the book series: Advances in Intelligent and Soft Computing ((AINSC,volume 93))

Abstract

The protein sequencing effort of the last decade has produced very large amounts of data about which little is known. Extracting information from these proteins is the next step, and computational techniques are indispensable for it. Although no silver-bullet approach to enzyme detection and classification exists yet, machine learning formulations such as the state-of-the-art support vector machine (SVM) are among the most reliable options. This paper presents a framework specialized in peptidase analysis, namely detection and classification according to the hierarchies defined in the MEROPS database. Feature selection with SVM-RFE is used to improve the discriminative models and to build computationally more efficient classifiers.
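To make the feature selection step concrete, the following is a minimal sketch of SVM-based recursive feature elimination (SVM-RFE) in the general form described by Guyon et al.: train a linear SVM, rank features by squared weight, drop the lowest-ranked one, and repeat. It is not the authors' implementation. The trainer is a simplified full-batch subgradient solver standing in for a real SVM library, and the function names, hyperparameters, and toy data are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def train_linear_svm(
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    active: Sequence[int],
    lam: float = 0.01,   # assumed regularisation strength
    lr: float = 0.1,     # assumed learning rate
    epochs: int = 200,   # assumed number of passes
) -> Tuple[Dict[int, float], float]:
    """Minimise the L2-regularised hinge loss by full-batch subgradient descent.

    Only the feature indices listed in `active` are used. This is a
    simplified stand-in for a production SVM solver.
    """
    w = {j: 0.0 for j in active}
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = {j: lam * w[j] for j in active}
        grad_b = 0.0
        for xi, yi in zip(X, y):
            score = sum(w[j] * xi[j] for j in active) + b
            if yi * score < 1.0:  # sample violates the margin
                for j in active:
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        for j in active:
            w[j] -= lr * grad_w[j]
        b -= lr * grad_b
    return w, b


def svm_rfe(
    X: Sequence[Sequence[float]], y: Sequence[int], n_keep: int = 1
) -> Tuple[List[int], List[int]]:
    """Return (kept feature indices, eliminated indices in removal order)."""
    active = list(range(len(X[0])))
    eliminated: List[int] = []
    while len(active) > n_keep:
        w, _ = train_linear_svm(X, y, active)
        # SVM-RFE ranking criterion: the smallest squared weight goes first.
        worst = min(active, key=lambda j: w[j] ** 2)
        active.remove(worst)
        eliminated.append(worst)
    return active, eliminated


# Hypothetical toy data: feature 0 separates the classes, feature 1 is a
# tenfold scaled-down copy of it, and feature 2 is constant zero.
POS = [[2.0, 0.2, 0.0], [1.5, 0.15, 0.0], [2.5, 0.25, 0.0], [1.8, 0.18, 0.0]]
X = POS + [[-v for v in row] for row in POS]
y = [1] * len(POS) + [-1] * len(POS)

kept, eliminated = svm_rfe(X, y, n_keep=1)
print("kept:", kept, "eliminated (in order):", eliminated)
```

On this toy data the constant feature receives zero weight and is removed first, which illustrates how the procedure shrinks the feature set and, with it, the cost of evaluating the final classifier. In practice one would retrain with an established SVM library and choose the number of retained features by cross-validation.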


Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Morgado, L., Pereira, C., Veríssimo, P., Dourado, A. (2011). Peptidase Detection and Classification Using Enhanced Kernel Methods with Feature Selection. In: Rocha, M.P., Rodríguez, J.M.C., Fdez-Riverola, F., Valencia, A. (eds) 5th International Conference on Practical Applications of Computational Biology & Bioinformatics (PACBB 2011). Advances in Intelligent and Soft Computing, vol 93. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19914-1_4

  • DOI: https://doi.org/10.1007/978-3-642-19914-1_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-19913-4

  • Online ISBN: 978-3-642-19914-1

  • eBook Packages: Engineering, Engineering (R0)
