Abstract
Gene Ontology (GO) is a taxonomy of biological terms related to the properties of genes and gene products. It can be used to define a similarity measure between two gene products and to assign a confidence score to protein-protein interactions (PPIs). GO evolves regularly through the addition, deletion, and merging of terms. However, no study has evaluated the robustness of a particular similarity measure over the evolution of GO. By robustness, we mean that a similarity measure should either improve or maintain its performance as GO evolves. In this paper, we systematically study this robustness for the task of scoring the confidence of PPIs using GO-based similarity measures. We observe that the performance of similarity measures is affected by the regular updates of GO. We find that similarity measures are not robust in all conditions; rather, they maintain similar performance over the evolution of GO only in certain conditions.
1 Introduction
Gene Ontology (GO) [1] is a taxonomy of biological terms that represents the properties of genes and/or gene products (e.g., proteins) (see Note 1). It is organized as a directed acyclic graph (DAG) that describes the relationships among the terms. Gene products are annotated to pertinent GO terms through annotation corpora. GO comprises three ontologies: biological process (BP), cellular component (CC), and molecular function (MF). Lord et al. [7] pioneered the use of ontology-based semantic similarity measures (SSMs) in the field of genomics. An SSM is a quantitative function, \( SSM (t_1,t_2)\), that measures the closeness between two terms \(t_1\) and \(t_2\) based on their semantic representations in a given ontology. Subsequently, a variety of GO-based SSMs have been proposed and successfully applied to different genomics applications [4, 9].
A high similarity score between two proteins indicates that they are annotated with similar cellular components (if the CC ontology is used) or with similar biological processes (if the BP ontology is used). This gives indirect evidence that the two proteins are more likely to interact than pairs with a low similarity score. Hence several studies have used a GO-based SSM between the two gene products involved in a PPI as a confidence score for the interaction. However, GO is updated regularly through the addition, deletion, and merging of terms along with their annotations. This may affect the similarity score of a protein pair calculated over different versions of the ontology. To the best of our knowledge, no study has systematically examined the effect of the evolution of GO on SSMs. In this paper, we systematically study whether changes in GO affect the performance of GO-based SSMs. Further, we compare multiple GO-based SSMs under this setting for the task of scoring the confidence of PPIs.
Section 2 briefly discusses the necessary background and terminology. In Sect. 3, we describe the datasets and the GO versions used, along with the evaluation metrics. Results are discussed and analyzed in Sect. 4.
2 Background
Semantic Similarity Measure (SSM). SSMs can be categorized mainly into two approaches: edge-based and node-based [10]. The edge-based approach mainly considers the shared paths between two ontology terms and does not account for the annotation information of the terms. Node-based SSMs compute the similarity between two terms by comparing their properties, common ancestors, and descendants. This approach is less sensitive to the topological structure of the ontology but more sensitive to changes in annotations. SSMs such as [2, 14] try to combine both node- and edge-based approaches and are commonly referred to as hybrid approaches. A few methods, such as TCSS [4], are developed based on the complex structure of the GO DAG.
SSMs are defined for two individual terms, but a protein is annotated with a set of terms. So if two proteins \(p_1\) and \(p_2\) are annotated with sets of terms S and T, respectively, then \( SSM (p_1,p_2)\) is calculated as \( SSM (S,T)\), which requires combining the SSM values of individual term pairs. Generally, three strategies are used in the literature: maximum (MAX), average (Avg), and best-match average (BMA). In the MAX and Avg strategies, the similarity between S and T is calculated as the maximum and the average, respectively, of the term-pair similarities over the set \(S \times T\). These term-pair similarities can be arranged as a matrix with one row per term in S and one column per term in T. BMA is defined as the average of the maximum similarity scores of each row and each column of this matrix.
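As an illustration of the three combination strategies, the following is a minimal Python sketch rather than the implementation used in the experiments. The function name `combine` and the callable `sim` (standing in for any term-level SSM) are hypothetical. BMA follows the definition given above (mean over all row maxima and column maxima); some tools instead average the row mean and the column mean, which can give a slightly different value.

```python
from statistics import mean


def combine(sim, S, T, strategy="BMA"):
    """Combine term-level similarities into one protein-level score.

    sim      -- hypothetical callable sim(t1, t2) -> float (any term-level SSM)
    S, T     -- non-empty collections of GO terms annotating the two proteins
    strategy -- "MAX", "AVG", or "BMA"
    """
    S, T = list(S), list(T)
    if not S or not T:
        raise ValueError("both term sets must be non-empty")

    # Similarity matrix: one row per term in S, one column per term in T.
    matrix = [[sim(s, t) for t in T] for s in S]

    if strategy == "MAX":
        return max(max(row) for row in matrix)
    if strategy == "AVG":
        return mean(value for row in matrix for value in row)
    if strategy == "BMA":
        row_best = [max(row) for row in matrix]
        col_best = [max(col) for col in zip(*matrix)]
        # Average over all row maxima and all column maxima together.
        return mean(row_best + col_best)
    raise ValueError(f"unknown strategy: {strategy}")
```

With a toy similarity table for S = {a, b} and T = {x, y, z}, MAX returns the single best term pair, whereas BMA also rewards proteins whose remaining terms find reasonably good partners.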
3 Experimental Design
GOs and SSMs. We consider BP and CC ontologies along with MAX and BMA in the evaluation. These ontologies and strategies are the most relevant for scoring confidence of PPIs [8]. We exclude electronically inferred annotations (IEA) as they are not verified by human experts. Further, we consider only those PPIs where both the interacting proteins are annotated to at least one GO term other than the root.
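The two filtering rules above (dropping IEA annotations and keeping only PPIs whose proteins both retain a non-root term) can be sketched as follows. This is a minimal sketch under assumptions: the tuple layout `(protein, go_term, evidence_code)` and the function name `filter_ppis` are hypothetical, and in practice the filter would be applied separately to the annotations of each ontology (BP or CC).

```python
# Root terms of the three ontologies (BP, CC, MF).
ROOT_TERMS = {"GO:0008150", "GO:0005575", "GO:0003674"}


def filter_ppis(ppis, annotations):
    """Keep PPIs whose two proteins both have a usable GO annotation.

    ppis        -- iterable of (protein_a, protein_b) pairs
    annotations -- iterable of (protein, go_term, evidence_code) tuples
                   (hypothetical layout, for illustration only)
    """
    usable = {}
    for protein, term, evidence in annotations:
        # Skip electronically inferred annotations and root-only annotations.
        if evidence == "IEA" or term in ROOT_TERMS:
            continue
        usable.setdefault(protein, set()).add(term)
    return [(a, b) for a, b in ppis if usable.get(a) and usable.get(b)]
```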
We select five different Bioconductor versions of GO and corresponding annotation corpora: 3.0 (2014-09-13), 3.1 (2015-03-13), 3.2 (2015-09-19), 3.3 (2016-03-05), and 3.4 (2016-09-21). We consider six state-of-the-art SSMs proposed by Resnik [12], Lin [6], Schlicker et al. [13], Jiang and Conrath [5], Wang et al. [14], and Jain and Bader [4], referred to as Resnik, Lin, Rel, Jiang, Wang, and TCSS, respectively, in the rest of the paper. Resnik and TCSS with MAX strategy have been considered to be the best SSMs for scoring confidence of PPIs by several studies [4, 9]. We also consider RDS, RNS, and RES, proposed recently by Paul and Anand [8]. The selected nine SSMs encompass all types of SSMs, as discussed in Sect. 2.
Datasets. We utilize the core subset of the yeast PPIs from the DIP database (Database of Interacting Proteins) [15], downloaded on 29.10.2015, as positive instances. As done in [4], an equal number of negative PPI instances are generated independently by randomly choosing pairs of proteins that are annotated in BP and CC and that are not present in the iRefWeb database [11], a combined database of all known PPIs, accessed on 27.11.2015.
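The negative-set construction can be sketched as rejection sampling over random protein pairs. This is an illustrative sketch only: the function name `sample_negatives`, the fixed `seed`, and the in-memory representation of the known interactions are assumptions, not details taken from [4] or from this study.

```python
import random


def sample_negatives(proteins, known_ppis, n, seed=0):
    """Draw n distinct protein pairs that are not known interactions.

    proteins   -- annotated proteins eligible for sampling
    known_ppis -- iterable of (a, b) pairs treated as known interactions
    n          -- number of negative pairs (e.g. the number of positives)
    """
    rng = random.Random(seed)
    proteins = sorted(set(proteins))
    protein_set = set(proteins)
    # Unordered pairs, so (a, b) and (b, a) count as the same interaction.
    known = {frozenset(pair) for pair in known_ppis}

    total_pairs = len(proteins) * (len(proteins) - 1) // 2
    blocked = sum(1 for k in known if len(k) == 2 and k <= protein_set)
    if n > total_pairs - blocked:
        raise ValueError("not enough non-interacting pairs to sample from")

    negatives = set()
    while len(negatives) < n:
        pair = frozenset(rng.sample(proteins, 2))
        if pair not in known:
            negatives.add(pair)
    return sorted(tuple(sorted(pair)) for pair in negatives)
```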
Proteins involved in a pathway are more likely to interact among themselves and likely to be annotated to the same or similar GO terms and thus should show high similarity scores. A set of 11 yeast (S. cerevisiae) KEGG pathways is selected as in [8]. During the selection of pathways, the authors of [8] try to maintain a trade-off between functional diversity and computational time required for the experiment.
Evaluation Metrics. A similarity measure can classify a set of PPIs into two groups: positives and negatives, for a given cutoff similarity score. Hence an SSM can be treated as a binary classifier. We utilize the area under the ROC curve (AUC) as an evaluation metric for binary classifiers.
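Because the AUC equals the probability that a randomly chosen positive PPI receives a higher score than a randomly chosen negative one, it can be computed directly from the two score lists without fixing a cutoff. The following is a minimal sketch of that pairwise formulation (ties counted as one half); the function name `auc` is hypothetical, and the quadratic loop is chosen for clarity rather than speed.

```python
def auc(pos_scores, neg_scores):
    """Area under the ROC curve from positive and negative similarity scores.

    Computed as the fraction of (positive, negative) pairs in which the
    positive scores higher; ties contribute one half.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))
```

A value of 1.0 means every positive PPI outscores every negative one, and 0.5 corresponds to random ordering.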
For each KEGG pathway, an intra-set average similarity is computed as the average of all pairwise similarities of proteins within the pathway. An inter-set average similarity for every two pathways is computed as the average of all pairwise cross-similarities of proteins between the two pathways. The discriminating power (DP) of a pathway is defined, as in [3], as the ratio between its intra-set average similarity and the average of all inter-set average similarities between that pathway and the other pathways. Thus the DP quantifies the ability of an SSM to distinguish among functionally different sets of proteins (e.g., KEGG pathways).
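The DP definition can be sketched as follows. This is an illustrative sketch under stated assumptions: `sim` is a hypothetical protein-level similarity function, `pathways` is a hypothetical mapping from pathway name to its proteins, the intra-set average is taken over distinct protein pairs only (self-pairs are excluded, which is an assumption), and each pathway is assumed to contain at least two proteins.

```python
from itertools import combinations, product
from statistics import mean


def intra_similarity(sim, proteins):
    """Average similarity over all distinct protein pairs within one set."""
    return mean(sim(a, b) for a, b in combinations(proteins, 2))


def inter_similarity(sim, set_a, set_b):
    """Average similarity over all cross pairs between two sets."""
    return mean(sim(a, b) for a, b in product(set_a, set_b))


def discriminating_power(sim, pathways, name):
    """DP of one pathway: intra-set average divided by mean inter-set average.

    pathways -- dict mapping pathway name -> list of proteins (hypothetical)
    name     -- the pathway whose DP is computed
    """
    intra = intra_similarity(sim, pathways[name])
    inter = mean(
        inter_similarity(sim, pathways[name], members)
        for other, members in pathways.items()
        if other != name
    )
    return intra / inter
```

A DP well above 1 means that proteins within the pathway are, on average, scored as much more similar to one another than to proteins of the other pathways.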
4 Results and Discussion
ROC curve analysis: Table 1 summarizes the AUC of the top five SSMs for the different versions of the BP ontology. The negligible change in AUC values for all SSMs suggests that, over the full dataset, the evolution of GO has little impact on their classification performance. This can be explained easily. An AUC of 1 implies a perfect classifier, while an AUC of 0.5 indicates a random classifier. So the practical range of AUC for a reasonably good classifier is very limited (generally, [0.7, 1]). Unless the majority of the PPIs are affected by the changes in GO, one would not expect high variability in AUCs over the different versions of GO. By affected, we mean that, for a given PPI, an SSM produces different similarity scores for different GO versions. In fact, the majority of PPIs in the PPI dataset are not affected significantly by the changes in GO.
To examine the impact more closely, we identify the PPIs whose similarity scores change over the versions of GO. For each SSM, we select the PPIs common to the five GO versions (more than \(99\%\) of PPIs are common). For each of the selected PPIs, the standard deviation of the five similarity scores corresponding to the five GO versions is calculated. Then we sort the PPIs by their standard deviation in descending order and select the top \(10\%\). The selected PPIs are the \(10\%\) of PPIs most affected by the changes in GO. An equal number of negative PPIs are selected from the already generated negative PPIs for the corresponding SSM. Finally, the AUC is computed for the selected positive and negative PPIs for each GO version. The resulting AUCs of the two best-performing SSMs for the different versions of GO-BP are shown in Table 2.
Now the performance variations of SSMs among GO versions are quite visible. For RES, we observe relative changes of approximately \(8\%\) and \(4\%\) with the MAX and BMA strategies, respectively. Similarly, for TCSS, we observe relative changes of approximately \(6\%\) and \(7\%\) with the MAX and BMA strategies, respectively. These changes are observed between versions 3.0 and 3.4. Similar observations are made for the other SSMs and the other ontology. We also observe that, across all measures, the overall variability is higher in CC than in BP.
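The selection of the most affected PPIs can be sketched as below. This is a minimal sketch under assumptions: `scores_by_version` is a hypothetical mapping from GO version to a dictionary of PPI scores for one SSM, and the population standard deviation is used. The text does not state which variant of the standard deviation was used; since every PPI has the same number of scores, the sample and population variants produce the same ranking.

```python
from statistics import pstdev


def most_affected(scores_by_version, fraction=0.10):
    """Return the PPIs whose scores vary most across GO versions.

    scores_by_version -- dict: version label -> dict: ppi -> similarity score
                         (hypothetical layout, for illustration only)
    fraction          -- share of common PPIs to keep, e.g. 0.10 for the top 10%
    """
    versions = list(scores_by_version.values())
    # Only PPIs that received a score under every GO version are comparable.
    common = set(versions[0]).intersection(*versions[1:])
    spread = {ppi: pstdev([v[ppi] for v in versions]) for ppi in common}
    # Descending standard deviation; ties are broken by the PPI key for determinism.
    ranked = sorted(common, key=lambda ppi: (-spread[ppi], ppi))
    keep = max(1, round(fraction * len(ranked)))
    return ranked[:keep]
```

Repeating the call with fractions from 1.0 down to 0.10 yields the nested subsets of increasingly variable PPIs used for the cutoff analysis.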
To find a general pattern of variability among SSMs, we repeat the aforementioned process for different cutoffs of affected PPIs (from \(100\%\) down to the top \(10\%\)). Here a cutoff of \(100\%\) implies that all PPIs are considered and hence the majority of them show no change in their similarity score. The mean AUCs (over the five GO versions) achieved by the SSMs, in increasing order of the variability of PPIs, are shown in Fig. 1.
SSMs with the BMA strategy show greater robustness than those with the MAX strategy. Almost all SSMs with the BMA strategy either improve on or maintain their initial performance as variability increases in both ontologies. In BP in particular, the improvement is smoother and more consistent. With the MAX strategy, however, the performance fluctuates considerably, and the irregularity is greater in CC. It therefore seems that the MAX strategy overestimates similarity in many cases, especially in CC.
All SSMs exhibit higher robustness in BP than in CC. Examining each SSM separately gives further insights (see Figs. 2 and 3). With all data considered (\(100\%\)), SSMs with the MAX strategy give a better AUC than those with BMA. However, as variability increases (by removing PPIs whose scores do not change over GO evolution), SSMs with BMA obtain higher AUCs. For TCSS, although BMA improves its performance continuously, it is unable to surpass the performance of MAX, particularly in BP. In fact, the difference in performance between MAX and BMA for TCSS and Resnik shrinks as variability increases, and the two strategies show almost similar performance on highly variable PPIs (>50%).
RES-BMA consistently produces the highest AUCs as variability increases. In general, RES, RNS, and TCSS show comparatively high robustness. With the top \(10\%\) most variable PPIs, the highest mean AUC is 0.949/0.957 (BP/CC), produced by RES-BMA, while the second-highest mean AUC is 0.922/0.940 (BP/CC), produced by TCSS with MAX or BMA.
Set-discriminating power of KEGG pathways: For each GO version and SSM, we calculate the DP value of each pathway with respect to the other 10 pathways. Then we take the mean DP value for each GO version. Table 3 shows the mean DP values of all 11 pathways for each GO-BP version and SSM.
The majority of SSMs produce quite similar DP values over the evolution of GO, since few PPIs are affected by the changes in GO. RES almost always produces higher DP values in both ontologies, particularly with the BMA strategy. TCSS shows competitive performance in both ontologies, while Jiang achieves good DP values in BP only. Significant differences between the MAX and BMA strategies, in both BP and CC simultaneously, are observed only with RES, TCSS, and, to some extent, RNS.
RES-BMA shows continuous and significant improvement over the evolution of GO. We can assume that a newer GO version represents more accurate and complete information than an older one, and a robust SSM should reflect that positively. RES-BMA almost continuously improves its DP value over the evolution of the BP ontology (5.59, 5.76, 6.38, 6.58, and 6.50), except for the last version (Ver. 3.4), whereas the other SSMs keep their performance quite similar. In fact, the changes, particularly in edges, between the last two GO-BP versions (Ver. 3.3 to Ver. 3.4) are very small (+0.30%) in comparison with those between other successive versions (the average successive change is +2.91%). Hence the changes in GO are reflected better by RES-BMA than by the other SSMs.
5 Conclusion
In this paper, we systematically study how similarity measures are affected by the evolution of Gene Ontology for the task of scoring the confidence of PPIs. We observe that the performance of each measure is affected by the regular updates of GO. All SSMs exhibit satisfactory robustness with the BMA strategy in the BP ontology only. SSMs with the MAX strategy tend to overestimate, particularly in CC. Although RES-BMA, TCSS-BMA, and RNS-BMA all exhibit comparatively good robustness, the changes in GO are reflected better by RES-BMA than by the others.
Notes
1. Hereafter we refer to gene products only.
References
1. Ashburner, M., et al.: Gene ontology: tool for the unification of biology. Nature Genet. 25(1), 25–29 (2000)
2. Bandyopadhyay, S., Mallick, K.: A new path based hybrid measure for gene ontology similarity. IEEE/ACM Trans. Comput. Biol. Bioinform. (TCBB) 11(1), 116–127 (2014)
3. Benabderrahmane, S., Smail-Tabbone, M., Poch, O., Napoli, A., Devignes, M.D.: IntelliGO: a new vector-based semantic similarity measure including annotation origin. BMC Bioinform. 11(1), 588 (2010)
4. Jain, S., Bader, G.D.: An improved method for scoring protein-protein interactions using semantic similarity within the gene ontology. BMC Bioinform. 11(1), 562 (2010)
5. Jiang, J.J., Conrath, D.W.: Semantic similarity based on corpus statistics and lexical taxonomy. In: Proceedings of 10th International Conference on Research In Computational Linguistics, ROCLING 1997 (1997)
6. Lin, D.: An information-theoretic definition of similarity. In: Proceedings of the Fifteenth International Conference on Machine Learning, vol. 98, pp. 296–304. Morgan Kaufmann Publishers Inc., San Francisco (1998)
7. Lord, P.W., Stevens, R.D., Brass, A., Goble, C.A.: Investigating semantic similarity measures across the gene ontology: the relationship between sequence and annotation. Bioinformatics 19(10), 1275–1283 (2003)
8. Paul, M., Anand, A.: A new family of similarity measures for scoring confidence of protein interactions using gene ontology, p. 459107. bioRxiv (2018)
9. Pesquita, C.: Semantic similarity in the gene ontology. In: Dessimoz, C., Škunca, N. (eds.) The Gene Ontology Handbook. MMB, vol. 1446, pp. 161–173. Springer, New York (2017). https://doi.org/10.1007/978-1-4939-3743-1_12
10. Pesquita, C., Faria, D., Falcao, A.O., Lord, P., Couto, F.M.: Semantic similarity in biomedical ontologies. PLoS Comput. Biol. 5(7), e1000443 (2009)
11. Razick, S., Magklaras, G., Donaldson, I.M.: iRefIndex: a consolidated protein interaction database with provenance. BMC Bioinform. 9(1), 1 (2008)
12. Resnik, P.: Using information content to evaluate semantic similarity in a taxonomy. In: Proceedings of the 14th International Joint Conference on Artificial Intelligence, pp. 448–453. Morgan Kaufmann Publishers Inc., San Francisco (1995)
13. Schlicker, A., Domingues, F.S., Rahnenführer, J., Lengauer, T.: A new measure for functional similarity of gene products based on gene ontology. BMC Bioinform. 7(1), 302 (2006)
14. Wang, J.Z., Du, Z., Payattakool, R., Yu, P.S., Chen, C.F.: A new method to measure the semantic similarity of GO terms. Bioinformatics 23(10), 1274–1281 (2007)
15. Xenarios, I., Rice, D.W., Salwinski, L., Baron, M.K., Marcotte, E.M., Eisenberg, D.: DIP: the database of interacting proteins. Nucleic Acids Res. 28(1), 289–291 (2000)
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Paul, M., Anand, A., Pyne, S. (2019). Impact of the Continuous Evolution of Gene Ontology on Similarity Measures. In: Deka, B., Maji, P., Mitra, S., Bhattacharyya, D., Bora, P., Pal, S. (eds) Pattern Recognition and Machine Intelligence. PReMI 2019. Lecture Notes in Computer Science(), vol 11942. Springer, Cham. https://doi.org/10.1007/978-3-030-34872-4_14
DOI: https://doi.org/10.1007/978-3-030-34872-4_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34871-7
Online ISBN: 978-3-030-34872-4
eBook Packages: Computer Science, Computer Science (R0)