Evaluation of Machine Learning Algorithms on Protein-Protein Interactions

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 242)

Abstract

Protein-protein interactions are important for the majority of biological processes. Many computational methods have been developed to predict protein-protein interactions from proteins’ sequence, structural and genomic data. This motivated us to perform a comparative study of various machine learning methods, training them on a set of known protein-protein interactions and using the proteins’ global and local attributes. The classifiers were evaluated through cross-validation, and several performance measures were computed. The results show that the support vector machine outperformed the other classifiers. This finding was confirmed with a statistical test, the Wilcoxon rank sum test, at the 5% significance level.
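The statistical comparison mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each classifier yields one score per cross-validation run and that the two lists of scores are compared with a two-sided Wilcoxon rank sum test using the normal approximation. The function name `rank_sum_test`, the score lists, and the choice of approximation are assumptions made for illustration; they are not taken from the paper, and the numbers are placeholders rather than reported results.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test (normal approximation).

    Returns (z, p). Tied values receive average ranks; no tie
    correction is applied to the variance, which is an assumption
    of this sketch.
    """
    n1, n2 = len(x), len(y)
    # Pool both samples, remembering which group each value came from.
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)

    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i.
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        in_x = sum(1 for k in range(i, j) if pooled[k][1] == 0)
        rank_sum_x += avg_rank * in_x
        i = j

    # Mean and standard deviation of the rank sum under the null hypothesis.
    mean = n1 * (n1 + n2 + 1) / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (rank_sum_x - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value
    return z, p


# Hypothetical per-run accuracies (placeholders, NOT results from the paper).
svm_acc = [0.91, 0.93, 0.92, 0.94, 0.90, 0.93, 0.92, 0.95, 0.91, 0.94]
rf_acc = [0.86, 0.88, 0.87, 0.89, 0.85, 0.88, 0.87, 0.89, 0.86, 0.88]

z, p = rank_sum_test(svm_acc, rf_acc)
print(f"z = {z:.3f}, p = {p:.5f}, significant at 5%: {p < 0.05}")
```

With real data, a library routine such as `scipy.stats.ranksums` could be used in place of the hand-written function.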

An erratum to this chapter can be found at http://dx.doi.org/10.1007/978-3-319-02309-0_71



Author information

Corresponding author

Correspondence to Indrajit Saha.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Saha, I. et al. (2014). Evaluation of Machine Learning Algorithms on Protein-Protein Interactions. In: Gruca, D., Czachórski, T., Kozielski, S. (eds) Man-Machine Interactions 3. Advances in Intelligent Systems and Computing, vol 242. Springer, Cham. https://doi.org/10.1007/978-3-319-02309-0_22

  • DOI: https://doi.org/10.1007/978-3-319-02309-0_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-02308-3

  • Online ISBN: 978-3-319-02309-0

  • eBook Packages: Engineering, Engineering (R0)
