Abstract
This paper addresses the information-theoretic definition of semantic similarity, which is based on the notion of information content, and presents an evolution of a novel approach for evaluating semantic similarity in a taxonomy. This approach takes into account not only the generic sense of a concept but also its intended sense in a given context. In particular, a method for computing the semantic relatedness of concepts in RDF knowledge graphs is used to evaluate the relevance of the intended sense of a concept with respect to its generic sense. The experiment shows that the relatedness method based on triple patterns adopted in this paper yields higher correlation with human judgment than the original proposal, which is based on a triple weights relatedness measure.
Change history
09 March 2023
In an older version of this paper, the online XML was correct, but in the PDF the characters > and < were not recognized on page 69. This has been corrected.
Notes
1. \(ic(p) + ic(r_o \vert p) = ic(Pr(X=p)) + ic(Pr(Y=r_o \vert X=p)) = -\log(Pr(X=p)) - \log(Pr(Y=r_o \vert X=p)) = -\log(Pr(X=p)\,Pr(Y=r_o \vert X=p)) = -\log(Pr(X=p, Y=r_o)) = ic(Pr(X=p, Y=r_o)) = ic(Pr(p, r_o)) = ic(p, r_o)\).
2. A direct link is an arc connecting two adjacent nodes, whereas an indirect link is a path with length greater than 1.
3. In the original work [23], this function is defined as \(C_d(p_j,r_a,n)\), where \(n\) represents the value of the function \(C_d(p_j,r_a)\).
4. Analogously to the function \(C_d\), in [23] these two functions are defined as \(C_{io}(p_j,r_a,n)\) and \(C_{ii}(p_j,r_a,n)\), respectively.
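The identity in the first note, that the information content of p plus the conditional information content of r_o given p equals the joint information content, can be checked numerically. The following is a minimal sketch, not the authors' code: the probability values are hypothetical, and the natural logarithm is an assumption, since the note does not state a base (the identity holds for any base).

```python
import math


def ic(probability: float) -> float:
    """Information content of an event: the negative log of its probability.

    The natural logarithm is an assumption made for this sketch.
    """
    return -math.log(probability)


# Hypothetical probabilities, chosen only for illustration.
pr_p = 0.2            # Pr(X = p)
pr_ro_given_p = 0.5   # Pr(Y = r_o | X = p)

# Product rule: Pr(X = p, Y = r_o) = Pr(X = p) * Pr(Y = r_o | X = p)
pr_joint = pr_p * pr_ro_given_p

lhs = ic(pr_p) + ic(pr_ro_given_p)   # ic(p) + ic(r_o | p)
rhs = ic(pr_joint)                   # ic(p, r_o)

# The two sides agree up to floating-point rounding.
print(math.isclose(lhs, rhs))
```

The check relies only on the fact that the logarithm turns a product of probabilities into a sum of information contents.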
References
Abdelrahman, A.M.B., Kayed, A.: A survey on semantic similarity measures between concepts in health domain. Am. J. Comput. Math. 5, 204–214 (2015)
Adhikari, A., Singh, S., Mondal, D., Dutta, B., Dutta, A.: A novel information theoretic framework for finding semantic similarity in WordNet. CoRR abs/1607.05422 (2016)
Adhikari, A., Dutta, B., Dutta, A., Mondal, D., Singh, S.: An intrinsic information content-based semantic similarity measure considering the disjoint common subsumers of concepts of an ontology. J. Assoc. Inf. Sci. Technol. 69(8), 1023–1034 (2018)
Majumder, G., Pakray, P., Gelbukh, A.: Measuring semantic textual similarity of sentences using modified information content and lexical taxonomy. Int. J. Comput. Linguist. Appl. 7(2), 65–85 (2016)
Banu, A., Fatima, S.S., Khan, K.: Information content based semantic similarity measure for concepts subsumed by multiple concepts. Int. J. Web Appl. 7(3), 85–94 (2015)
Batet, M., Sánchez, D.: Leveraging synonymy and polysemy to improve semantic similarity assessments based on intrinsic information content. Artif. Intell. Rev. 53(3), 2023–2041 (2020)
Cazzanti, L., Gupta, M.R.: Information-theoretic and set-theoretic similarity. In: IEEE International Symposium on Information Theory, Seattle, WA, pp. 1836–1840 (2006)
Chandrasekaran, D., Mago, V.: Evolution of semantic similarity - a survey. ACM Comput. Surv. 54(2), Article 41 (2021)
El Vaigh, C.B., Goasdoué, F., Gravier, G., Sébillot, P.: A novel path-based entity relatedness measure for efficient collective entity linking. In: Pan, J.Z., et al. (eds.) ISWC 2020. LNCS, vol. 12506, pp. 164–182. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-62419-4_10
Formica, A., Pourabbas, E.: Content based similarity of geographic classes organized as partition hierarchies. Knowl. Inf. Syst. 20(2), 221–241 (2009)
Formica, A., Missikoff, M., Pourabbas, E., Taglino, F.: Semantic search for matching user requests with profiled enterprises. Comput. Ind. 64(3), 191–202 (2013)
Formica, A.: Similarity reasoning in formal concept analysis: from one- to many-valued contexts. Knowl. Inf. Syst. 60(2), 715–739 (2019)
Formica, A., Mazzei, M., Pourabbas, E., Rafanelli, M.: Approximate query answering based on topological neighborhood and semantic similarity in OpenStreetMap. IEEE Access 8, 87011–87030 (2020)
Formica, A., Taglino, F.: An enriched information-theoretic definition of semantic similarity in a taxonomy. IEEE Access 9, 100583–100593 (2021)
Hadj Taieb, M.A., Zesch, T., Aouicha, M.B.: A survey of semantic relatedness evaluation datasets and procedures. Artif. Intell. Rev. 53, 4407–4448 (2020)
Hulpuş, I., Prangnawarat, N., Hayes, C.: Path-based semantic relatedness on linked data and its use to word and entity disambiguation. In: Arenas, M., et al. (eds.) ISWC 2015. LNCS, vol. 9366, pp. 442–457. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-25007-6_26
Jeong, S., Yim, J.H., Lee, H.J., Sohn, M.M.: Semantic similarity calculation method using information contents-based edge weighting. J. Internet Serv. Inf. Secur. 7(1), 40–53 (2017)
Jiang, J.J., Conrath, D.W.: Semantic similarity based on corpus statistics and lexical taxonomy. In: Proceedings of the International Conference on Research in Computational Linguistics (ROCLING X), Taiwan (1997)
Lastra-Díaz, J.J., García-Serrano, A.: A new family of information content models with an experimental survey on WordNet. Knowl.-Based Syst. 89, 509–526 (2015)
Lin, D.: An information-theoretic definition of similarity. In: Proceedings of the International Conference on Machine Learning, Madison, Wisconsin, USA, pp. 296–304. Morgan Kaufmann (1998)
Meymandpour, R., Davis, J.G.: A semantic similarity measure for linked data: an information content-based approach. Knowl.-Based Syst. 109, 276–293 (2016)
Miller, G.A., Charles, W.G.: Contextual correlates of semantic similarity. Lang. Cognit. Process. 6(1), 1–28 (1991)
Passant, A.: Measuring semantic distance on linking data and using it for resources recommendations. In: Proceedings of the AAAI Spring Symposium on Linked Data Meets Artificial Intelligence (2010)
Pirrò, G.: A semantic similarity metric combining features and intrinsic information content. Data Knowl. Eng. 68(11), 1289–1308 (2009)
Rada, R., Mili, H., Bicknell, E., Blettner, M.: Development and application of a metric on semantic nets. IEEE Trans. Syst. Man Cybern. 19(1), 17–30 (1989)
Resnik, P.: Using information content to evaluate semantic similarity in a taxonomy. In: Proceedings of the Int. Joint Conference on Artificial Intelligence, Montreal, Quebec, Canada, August 20–25, pp. 448–453. Morgan Kaufmann (1995)
Resnik, P.: Semantic similarity in a taxonomy: an information-based measure and its application to problems of ambiguity in natural language. J. Artif. Intell. Res. 11, 95–130 (1999)
Schwering, A.: Approaches to semantic similarity measurement for geo-spatial data: a survey. Trans. GIS 12(1), 5–29 (2008)
Schuhmacher, M., Ponzetto, S.P.: Knowledge-based graph document modeling. In: Proceedings of the 7th ACM International Conference on Web Search and Data Mining, (WSDM), New York, USA, pp. 543–552 (2014)
Tversky, A.: Features of similarity. Psychol. Rev. 84, 327–352 (1977)
Taglino, F., Formica, A.: Semantic similarity with concept senses. Mendeley Data, V1 (2022). https://data.mendeley.com/datasets/994p293zcf
Wang, F., Wang, N., Cai, S., Zhang, W.: A similarity measure in formal concept analysis containing general semantic information and domain information. IEEE Access 8, 75303–75312 (2020)
Weller-Fahy, D.J., Borghetti, B.J., Sodemann, A.A.: A survey of distance and similarity measures used within network intrusion anomaly detection. IEEE Commun. Surv. Tutor. 17(1), 70–91 (2015)
Witten, I.H., Milne, D.: An effective, low-cost measure of semantic relatedness obtained from Wikipedia links. In: Proceedings of AAAI Workshop on Wikipedia and Artificial Intelligence: an Evolving Synergy, pp. 25–30. AAAI Press, Chicago, USA (2008)
Wu, Z., Palmer, M.: Verb semantics and lexical selection. In: Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, pp. 133–138 (1994)
Zhang, X., Sun, S., Zhang, K.: An information content-based approach for measuring concept semantic similarity in WordNet. Wireless Pers. Commun. 103(1), 117–132 (2018). https://doi.org/10.1007/s11277-018-5429-7
Zhou, W., Wang, H., Chao, J., Zhang, W., Yu, Y.: LODDO: using linked open data description overlap to measure semantic relatedness between named entities. In: Pan, J.Z. et al. (eds.) Proceedings of Joint International Semantic Technology Conference, JIST 2011 (2012)
Copyright information
© 2023 Springer-Verlag GmbH Germany, part of Springer Nature
About this chapter
Cite this chapter
Formica, A., Taglino, F. (2023). Semantic Similarity in a Taxonomy by Evaluating the Relatedness of Concept Senses with the Linked Data Semantic Distance. In: Hameurlain, A., Tjoa, A.M. (eds) Transactions on Large-Scale Data- and Knowledge-Centered Systems LIII. Lecture Notes in Computer Science, vol 13840. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-66863-4_3
DOI: https://doi.org/10.1007/978-3-662-66863-4_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-66862-7
Online ISBN: 978-3-662-66863-4
eBook Packages: Computer Science, Computer Science (R0)