Protein Function Prediction Using Multi-label Learning and ISOMAP Embedding

  • Conference paper
Bio-Inspired Computing -- Theories and Applications (BIC-TA 2015)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 562))

Abstract

As more and more high-throughput proteome data are collected, automated annotation of protein function has become one of the most challenging problems of the post-genomic era. To address this challenge, we propose a novel functional annotation framework that combines manifold embedding with multi-label classification to predict protein function on a protein-protein interaction (PPI) network. Unlike existing approaches that depend on the original network, our method weights the network by edge betweenness and simultaneously embeds the annotated and unannotated proteins into a Euclidean metric space via isometric feature mapping (ISOMAP). Then, with these low-dimensional coordinates, the protein expressions are quantified and the functional assignment is transformed into a multi-label classification problem. The approach yields a set of feasible functional labels for each unannotated protein. We conduct extensive experiments on a yeast PPI database to evaluate the performance of different multi-label learning methods. The results demonstrate that the proposed method is an effective tool for protein function prediction.
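The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions not stated in the abstract: the edge betweenness value is used directly as the edge length, the PPI network itself serves as the ISOMAP neighbourhood graph (so the network must be connected), and a plain majority vote among the k nearest annotated proteins stands in for the multi-label learners the paper evaluates. The toy network, the labels F1 and F2, and all function names are hypothetical.

```python
from collections import deque
import numpy as np

def edge_betweenness(adj):
    """Brandes-style edge betweenness for an unweighted, undirected graph."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        dist, order = {s: 0}, []
        sigma = {v: 0.0 for v in adj}
        sigma[s] = 1.0
        preds = {v: [] for v in adj}
        queue = deque([s])
        while queue:  # BFS that counts shortest paths from s
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):  # accumulate dependencies back toward s
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += c
                delta[v] += c
    return {e: b / 2.0 for e, b in score.items()}  # each pair was counted twice

def geodesic_distances(nodes, weights):
    """All-pairs shortest paths (Floyd-Warshall); assumed: edge length = betweenness."""
    idx = {v: i for i, v in enumerate(nodes)}
    D = np.full((len(nodes), len(nodes)), np.inf)
    np.fill_diagonal(D, 0.0)
    for edge, w in weights.items():
        u, v = tuple(edge)
        D[idx[u], idx[v]] = D[idx[v], idx[u]] = w
    for k in range(len(nodes)):
        D = np.minimum(D, D[:, [k]] + D[[k], :])
    return D

def classical_mds(D, dim):
    """Final ISOMAP step: embed geodesic distances by classical MDS."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:dim]           # largest eigenvalues first
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))

def predict_labels(coords, nodes, labels, query, k=3):
    """Stand-in multi-label rule: keep labels held by most of the k nearest annotated proteins."""
    q = coords[nodes.index(query)]
    annotated = [i for i, p in enumerate(nodes) if labels[p] is not None]
    nearest = sorted(annotated, key=lambda i: np.linalg.norm(coords[i] - q))[:k]
    votes = {}
    for i in nearest:
        for lab in labels[nodes[i]]:
            votes[lab] = votes.get(lab, 0) + 1
    return {lab for lab, n in votes.items() if 2 * n > len(nearest)}

# Hypothetical toy network: two triangles joined by the bridge c-d.
adj = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
# None marks an unannotated protein.
labels = {"a": {"F1"}, "b": {"F1"}, "c": None, "d": {"F2"}, "e": {"F2"}, "f": None}

nodes = sorted(adj)
weights = edge_betweenness(adj)
coords = classical_mds(geodesic_distances(nodes, weights), dim=2)
predicted = {p: predict_labels(coords, nodes, labels, p)
             for p in nodes if labels[p] is None}
print(predicted)
```

Using betweenness as edge length stretches bridge-like edges between dense groups, so proteins in the same group stay close in the embedding. Other mappings from betweenness to edge weight are equally plausible readings of the abstract.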

Acknowledgments

This work is supported by the National Natural Science Foundation of China (No. 61402002), the National Science Foundation of Anhui Province (No. 1408085QF120), and the Key Foundation of Natural Science Research for Institution of Higher Education of Anhui Province (No. KJ2013A007).

Author information

Corresponding author

Correspondence to Dengdi Sun.

Copyright information

© 2015 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Liang, H., Sun, D., Ding, Z., Ge, M. (2015). Protein Function Prediction Using Multi-label Learning and ISOMAP Embedding. In: Gong, M., Linqiang, P., Tao, S., Tang, K., Zhang, X. (eds) Bio-Inspired Computing -- Theories and Applications. BIC-TA 2015. Communications in Computer and Information Science, vol 562. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-49014-3_23

  • DOI: https://doi.org/10.1007/978-3-662-49014-3_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-49013-6

  • Online ISBN: 978-3-662-49014-3

  • eBook Packages: Computer Science, Computer Science (R0)
