Using SSN-Analyzer for analysis of semantic similarity networks

  • Original Article
  • Published in: Network Modeling Analysis in Health Informatics and Bioinformatics

Abstract

Semantic similarity measures (SSMs) are used to evaluate the similarity between terms of an ontology. Biological entities, e.g., gene products, are often annotated with terms drawn from existing ontologies. A common application is to quantify the similarity or dissimilarity between two entities by applying SSMs to their annotations. More recently, researchers have introduced semantic similarity networks (SSNs), i.e., edge-weighted graphs in which the nodes are concepts (e.g., proteins) and each edge carries a weight that represents the semantic similarity between the pair of nodes it connects. Community detection algorithms applied to SSNs may reveal clusters of functionally associated concepts. For instance, applying these algorithms to networks built from proteins may reveal protein complexes. However, SSNs contain a large number of low-weight edges, and classical community detection algorithms perform poorly on the raw networks. One way to improve their performance is to simplify the structure of SSNs through a preprocessing step that deletes edges that behave like noise. We therefore propose a novel preprocessing strategy to simplify SSNs, implemented in an open-source tool, SSN-Analyzer. As a proof of concept, we demonstrate that community detection algorithms applied to filtered (thresholded) networks yield more biologically relevant results than the same algorithms applied to raw, unfiltered networks.
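
The effect of thresholding described in the abstract can be illustrated with a minimal sketch. It does not reproduce the preprocessing strategy of SSN-Analyzer, which the abstract does not specify. It assumes a single global cut-off (the hypothetical parameter `tau`), a toy network with invented protein names and weights, and connected components as a simple stand-in for a real community detection algorithm.

```python
from collections import defaultdict


def threshold_filter(edges, tau):
    """Keep only edges whose weight is at least tau (assumed global cut-off)."""
    return [(u, v, w) for (u, v, w) in edges if w >= tau]


def connected_components(nodes, edges):
    """Group nodes into connected components (stand-in for community detection)."""
    adjacency = defaultdict(set)
    for u, v, _ in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen = set()
    components = []
    for start in sorted(nodes):
        if start in seen:
            continue
        # Depth-first traversal from an unvisited node.
        stack = [start]
        seen.add(start)
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components


# Toy SSN: hypothetical proteins with invented similarity weights.
toy_nodes = {"P1", "P2", "P3", "P4", "P5"}
toy_edges = [
    ("P1", "P2", 0.90), ("P2", "P3", 0.80), ("P1", "P3", 0.85),  # dense group
    ("P4", "P5", 0.75),                                          # second group
    ("P3", "P4", 0.10), ("P1", "P5", 0.05),                      # low-weight "noise"
]

raw_groups = connected_components(toy_nodes, toy_edges)
filtered_groups = connected_components(toy_nodes, threshold_filter(toy_edges, 0.5))

print(len(raw_groups), "group(s) in the raw network")
print(len(filtered_groups), "group(s) after thresholding")
```

In this toy case the low-weight edges merge all five nodes into one group, whereas the filtered network separates into two. The choice of `tau = 0.5` is arbitrary here; how the threshold is chosen is the substance of the preprocessing step and is not addressed by this sketch.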

Notes

  1. http://shiny.rstudio.com.

  2. http://www.cran-r.org.

  3. http://wodaklab.org/cyc2008/.

  4. http://wodaklab.org/cyc2008/.

Acknowledgments

This work has been partially supported by the following research projects funded by MIUR: PRIN 2010–2011 2010NFEB9L_003; PON04a2_D “DICET-INMOTO-ORCHESTRA”; PON04a2_C Staywell SH 2.0.

Author information

Corresponding author

Correspondence to Pietro H. Guzzi.

About this article

Cite this article

Guzzi, P.H., Milano, M., Veltri, P. et al. Using SSN-Analyzer for analysis of semantic similarity networks. Netw Model Anal Health Inform Bioinforma 4, 6 (2015). https://doi.org/10.1007/s13721-015-0077-2

  • DOI: https://doi.org/10.1007/s13721-015-0077-2
