Multi-label Correlated Semi-supervised Learning for Protein Function Prediction

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 6674)

Abstract

The availability of large volumes of molecular interaction data has led to a considerable number of computational approaches for studying protein function in the context of networks. These algorithms, however, treat each functional class independently and therefore have difficulty assigning multiple functions to a protein simultaneously. We propose a new semi-supervised algorithm, called MCSL, that takes the correlations among functional categories into account, which improves performance significantly. The guiding intuition is that a protein can receive label information not only from its neighbors annotated with the same category in the functional-linkage network, but also from its partners labeled with other classes in the category network, provided their respective neighborhood topologies are a good match. We encode this intuition as a two-dimensional version of network-based learning with local and global consistency. Experiments on a Saccharomyces cerevisiae protein-protein interaction network show that, under 5-fold cross-validation, our algorithm outperforms four state-of-the-art methods on both 66 second-level and 77 informative MIPS functional categories. Furthermore, we make predictions for 204 uncharacterized proteins, and most of these assignments can be found directly in, or inferred indirectly from, the SGD database.
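
To make the idea of propagating labels along two networks at once more concrete, the following is a minimal sketch and not the MCSL algorithm itself, whose exact formulation is not given in this abstract. It assumes the standard iterative update of learning with local and global consistency, extended with a second term that smooths scores over a category network. The update rule, the function names (`normalize`, `matmul`, `propagate`), the weights `alpha` and `beta`, and the toy networks are all illustrative assumptions.

```python
import math


def normalize(W):
    """Symmetric normalization S = D^{-1/2} W D^{-1/2} of an adjacency matrix."""
    deg = [sum(row) for row in W]
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in deg]
    n = len(W)
    return [[inv_sqrt[i] * W[i][j] * inv_sqrt[j] for j in range(n)]
            for i in range(n)]


def matmul(A, B):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def propagate(W_protein, W_category, Y, alpha=0.5, beta=0.2, iters=200):
    """Iterate F <- alpha*S_p*F + beta*F*S_c + (1 - alpha - beta)*Y.

    Assumed update rule (hypothetical, for illustration only):
    S_p spreads scores between neighboring proteins, S_c spreads
    scores between correlated categories, and Y anchors known labels.
    With beta = 0 this reduces to the usual local-and-global-consistency
    iteration, up to a constant scaling of the anchor term.
    """
    assert alpha >= 0 and beta >= 0 and alpha + beta < 1
    S_p = normalize(W_protein)
    S_c = normalize(W_category)
    n, k = len(Y), len(Y[0])
    F = [row[:] for row in Y]
    for _ in range(iters):
        PF = matmul(S_p, F)   # propagation over the protein network
        FC = matmul(F, S_c)   # propagation over the category network
        F = [[alpha * PF[i][c] + beta * FC[i][c]
              + (1 - alpha - beta) * Y[i][c] for c in range(k)]
             for i in range(n)]
    return F


# Toy data (made up): four proteins on a path 0-1-2-3, two linked categories.
W_protein = [[0, 1, 0, 0],
             [1, 0, 1, 0],
             [0, 1, 0, 1],
             [0, 0, 1, 0]]
W_category = [[0, 1],
              [1, 0]]
# Protein 0 is annotated with category 0, protein 3 with category 1;
# proteins 1 and 2 are uncharacterized.
Y = [[1, 0], [0, 0], [0, 0], [0, 1]]

F = propagate(W_protein, W_category, Y)
for i, row in enumerate(F):
    print(i, [round(score, 3) for score in row])
```

In this sketch each row of `F` holds one protein's score for every category, so several functions can be assigned to a protein by thresholding its row rather than training one independent classifier per category.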

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jiang, J.Q. (2011). Multi-label Correlated Semi-supervised Learning for Protein Function Prediction. In: Chen, J., Wang, J., Zelikovsky, A. (eds) Bioinformatics Research and Applications. ISBRA 2011. Lecture Notes in Computer Science, vol 6674. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21260-4_35

  • DOI: https://doi.org/10.1007/978-3-642-21260-4_35

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-21259-8

  • Online ISBN: 978-3-642-21260-4

  • eBook Packages: Computer Science, Computer Science (R0)
