A Novel Method for Classifying Subfamilies and Sub-subfamilies of G-Protein Coupled Receptors

  • Conference paper
Biological and Medical Data Analysis (ISBMDA 2006)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 4345)

Abstract

G-protein coupled receptors (GPCRs) are a large superfamily of integral membrane proteins that transduce signals across the cell membrane. Because of this property and the other physiological roles played by the GPCR family, they have been an important target of therapeutic drugs. The function of many GPCRs is not known, and accurate classification of GPCRs can help us predict their function. In this study we propose a kernel-based method to classify them at the subfamily and sub-subfamily level. To enhance the accuracy and sensitivity of the classifiers at the sub-subfamily level, where we faced a low number of sequences (imbalanced data), we used our new synthetic protein sequence oversampling (SPSO) algorithm. We obtained an overall accuracy and Matthews correlation coefficient (MCC) of 98.4% and 0.98 for class A, nearly 100% and 1 for class B, and 96.95% and 0.91 for class C, respectively, at the subfamily level, and an overall accuracy and MCC of 97.93% and 0.95 at the sub-subfamily level. The results show that our oversampling technique can be used for other protein classification applications that suffer from imbalanced data.
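The abstract reports both overall accuracy and MCC. As a minimal sketch of why the two are reported together on imbalanced data, the standard binary definitions can be computed from confusion-matrix counts as shown here. The function name and the counts are hypothetical illustrations and are not taken from the paper, which may compute its multi-class figures differently.

```python
import math


def accuracy_and_mcc(tp: int, tn: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (accuracy, MCC) for one binary confusion matrix.

    Standard textbook definitions; not necessarily the exact
    evaluation protocol used in the paper.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # MCC denominator: geometric mean of the four marginal sums.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when any marginal sum is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc


# Hypothetical imbalanced case: 10 positives among 1000 sequences,
# of which the classifier recovers only half.
acc, mcc = accuracy_and_mcc(tp=5, tn=990, fp=0, fn=5)
print(f"accuracy = {acc:.3f}, MCC = {mcc:.3f}")
```

In this made-up case the accuracy stays above 99% even though half of the rare class is missed, while the MCC drops to roughly 0.7. That sensitivity to the minority class is the usual reason for quoting MCC next to accuracy when classes are imbalanced.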

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Beigi, M., Zell, A. (2006). A Novel Method for Classifying Subfamilies and Sub-subfamilies of G-Protein Coupled Receptors. In: Maglaveras, N., Chouvarda, I., Koutkias, V., Brause, R. (eds) Biological and Medical Data Analysis. ISBMDA 2006. Lecture Notes in Computer Science, vol 4345. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11946465_3

  • DOI: https://doi.org/10.1007/11946465_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68063-5

  • Online ISBN: 978-3-540-68065-9

  • eBook Packages: Computer Science, Computer Science (R0)
