Impact of the Continuous Evolution of Gene Ontology on the Performance of Similarity Measures for Scoring Confidence of Protein Interactions

  • Original Research
SN Computer Science

Abstract

Gene Ontology (GO) is a comprehensive resource describing the properties of gene products and the relationships among them. A similarity measure between two gene products can be defined using GO, and the resulting similarity score can be treated as the likelihood that the two products physically interact. However, GO is updated regularly through the addition of new terms and the removal or merging of obsolete terms. The similarity score of an interaction may therefore differ from one version of GO to another. In this paper, we systematically study the impact of the continuous evolution of GO on the performance of similarity measures for the task of scoring the confidence of protein–protein interactions (PPIs). We find that the performance of a similarity measure is affected by the continuous evolution of GO. We further observe that the degree of robustness of a similarity measure depends strongly on the particular setting considered.
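
To make the effect concrete, the following minimal sketch scores one protein pair against two toy ontology snapshots. Everything in it is an assumption made for illustration: the term names (`root`, `A`, `A0`, `A1`, `B`), the proteins `P` and `Q`, and the measure itself, a Jaccard index over ancestor-closed annotation sets, which is a simple term-overlap score and not necessarily one of the measures evaluated in the paper.

```python
# Toy illustration (hypothetical data): the same annotations yield different
# similarity scores under two versions of an ontology.

def ancestors(term, parents):
    """Return the term together with all of its ancestors in the DAG."""
    seen = set()
    stack = [term]
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(parents[t])
    return seen

def closure(terms, parents):
    """Union of the ancestor sets of every annotated term."""
    result = set()
    for t in terms:
        result |= ancestors(t, parents)
    return result

def jaccard_similarity(terms_a, terms_b, parents):
    """Jaccard index of the two ancestor-closed annotation sets."""
    a = closure(terms_a, parents)
    b = closure(terms_b, parents)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Each snapshot maps a term to its list of parent terms.
go_v1 = {"root": [], "A": ["root"], "B": ["root"], "A1": ["A"]}
# Later snapshot: a new term A0 is inserted between A and A1.
go_v2 = {"root": [], "A": ["root"], "B": ["root"], "A0": ["A"], "A1": ["A0"]}

# Annotations are held fixed so that only the ontology changes.
annotations = {"P": {"A1"}, "Q": {"A"}}

score_v1 = jaccard_similarity(annotations["P"], annotations["Q"], go_v1)
score_v2 = jaccard_similarity(annotations["P"], annotations["Q"], go_v2)
print(f"v1: {score_v1:.3f}  v2: {score_v2:.3f}")  # v1: 0.667  v2: 0.500
```

Although the annotations of `P` and `Q` are identical in both runs, inserting a single intermediate term lowers the score from 2/3 to 1/2, so a confidence threshold tuned on one snapshot could classify the same pair differently on the next.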

Author information

Corresponding author

Correspondence to Madhusudan Paul.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “Computational Biology and Biomedical Informatics” guest edited by Dhruba Kr Bhattacharyya, Sushmita Mitra and Jugal Kr Kalita.

About this article

Cite this article

Paul, M., Anand, A. & Pyne, S. Impact of the Continuous Evolution of Gene Ontology on the Performance of Similarity Measures for Scoring Confidence of Protein Interactions. SN COMPUT. SCI. 1, 351 (2020). https://doi.org/10.1007/s42979-020-00350-5

  • DOI: https://doi.org/10.1007/s42979-020-00350-5
