Abstract
Semantic similarity plays a vital role in many shared-data applications, such as data and information integration. A first step towards building such applications is to identify concepts that are semantically similar to each other. One way to compute the similarity of two concepts is to assess their word similarity by exploiting different knowledge sources, e.g., ontologies, thesauri, and domain corpora. Over the last few years, several approaches to similarity assessment based on quantifying the information content of concepts have been proposed and have shown encouraging performance. For all these approaches, the Least Common Subsumer (LCS) of two concepts plays an important role in determining their similarity. In this paper, we investigate the influence of the choice of this node (or set of nodes) on the quality of the similarity assessment. In particular, we develop a particle swarm optimization approach that optimally discovers LCSs. An empirical evaluation, based on well-established biomedical benchmarks and ontologies, illustrates the accuracy of the proposed approach and demonstrates that the similarity estimations it provides are significantly more correlated with human ratings of similarity than those obtained by related work.
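To make the role of the common subsumer concrete, the following minimal sketch scores two concepts with Lin's well-known information-content measure and shows how the score depends on which common subsumer is used. It is an illustration only, not the particle swarm method proposed in the paper: the toy taxonomy, the concept names, the probabilities in PROB, and all function names are hypothetical assumptions introduced for this example.

```python
import math

# Hypothetical toy taxonomy (child -> parents); concept names are illustrative only.
PARENTS = {
    "entity": [],
    "disease": ["entity"],
    "heart_disease": ["disease"],
    "infection": ["disease"],
    "myocarditis": ["heart_disease", "infection"],
    "endocarditis": ["heart_disease", "infection"],
}

# Hypothetical occurrence probabilities p(c); these are assumed values, not
# corpus statistics from any real benchmark.
PROB = {
    "entity": 1.0,
    "disease": 0.5,
    "heart_disease": 0.1,
    "infection": 0.2,
    "myocarditis": 0.02,
    "endocarditis": 0.03,
}


def ic(concept):
    """Information content IC(c) = log(1 / p(c)); the root has IC 0."""
    return math.log(1.0 / PROB[concept])


def ancestors(concept):
    """Return the concept itself plus all of its transitive parents."""
    seen = {concept}
    stack = [concept]
    while stack:
        node = stack.pop()
        for parent in PARENTS[node]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def common_subsumers(a, b):
    """All concepts that subsume both a and b."""
    return ancestors(a) & ancestors(b)


def most_informative_subsumer(a, b):
    """The common subsumer with the highest information content."""
    return max(common_subsumers(a, b), key=ic)


def lin_similarity(a, b, subsumer):
    """Lin's measure, 2 * IC(subsumer) / (IC(a) + IC(b)), for a given subsumer."""
    denom = ic(a) + ic(b)
    return 2.0 * ic(subsumer) / denom if denom > 0 else 1.0


if __name__ == "__main__":
    a, b = "myocarditis", "endocarditis"
    # The same concept pair receives a different score for each candidate subsumer.
    for s in sorted(common_subsumers(a, b), key=ic):
        print(f"{s:15s} {lin_similarity(a, b, s):.3f}")
    print("most informative:", most_informative_subsumer(a, b))
```

In this toy taxonomy both concepts have two parents, so several common subsumers compete, and the resulting similarity ranges from zero (using the root) up to its maximum at the most informative subsumer. That spread is the sensitivity the abstract refers to when it investigates the choice of this node.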
Acknowledgments
The work has been (partly) funded by the Deutsche Forschungsgemeinschaft (DFG) as part of CRC 1076 AquaDiva. S. Babalou is also supported by a scholarship from the German Academic Exchange Service (DAAD).
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Babalou, S., Algergawy, A., König-Ries, B. (2017). A Particle Swarm-Based Approach for Semantic Similarity Computation. In: Panetto, H., et al. On the Move to Meaningful Internet Systems. OTM 2017 Conferences. OTM 2017. Lecture Notes in Computer Science, vol 10574. Springer, Cham. https://doi.org/10.1007/978-3-319-69459-7_11
DOI: https://doi.org/10.1007/978-3-319-69459-7_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-69458-0
Online ISBN: 978-3-319-69459-7
eBook Packages: Computer Science (R0)