Semantic Similarity in a Taxonomy by Evaluating the Relatedness of Concept Senses with the Linked Data Semantic Distance

  • Chapter
Transactions on Large-Scale Data- and Knowledge-Centered Systems LIII

Abstract

This paper addresses the information-theoretic definition of semantic similarity, which is based on the notion of information content, and presents an evolution of a novel approach for evaluating semantic similarity in a taxonomy. The approach takes into account not only the generic sense of a concept but also its intended sense in a given context. In particular, a method for computing the semantic relatedness of concepts in RDF knowledge graphs is used to evaluate the relevance of the intended sense of a concept with respect to its generic sense. The experiment shows that the relatedness method adopted in this paper, which is based on triple patterns, yields higher correlation with human judgment than the original proposal, which is based on a triple-weights relatedness measure.
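
As a point of reference for the information-content notion mentioned above, the following is a minimal sketch of the classical measures of Resnik [26] and Lin [20] on a taxonomy. It does not implement the enriched measure proposed in this chapter (the intended sense of a concept is not modelled). The taxonomy, the probabilities, and the names `PARENT`, `PROB`, `ic`, `ancestors`, `resnik`, and `lin` are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy taxonomy: child -> parent (the root has parent None).
PARENT = {
    "entity": None,
    "animal": "entity",
    "dog": "animal",
    "cat": "animal",
    "vehicle": "entity",
    "car": "vehicle",
}

# Hypothetical concept probabilities; a parent is at least as probable
# as each of its children, and the root has probability 1.
PROB = {
    "entity": 1.0,
    "animal": 0.5,
    "dog": 0.2,
    "cat": 0.25,
    "vehicle": 0.4,
    "car": 0.3,
}


def ic(concept):
    """Information content of a concept: ic(c) = -log Pr(c)."""
    return -math.log(PROB[concept])


def ancestors(concept):
    """Return the concept followed by all of its ancestors up to the root."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def resnik(c1, c2):
    """Resnik similarity: the highest ic among the common subsumers."""
    common = set(ancestors(c1)) & set(ancestors(c2))
    return max(ic(c) for c in common)


def lin(c1, c2):
    """Lin similarity: 2 * ic(most informative subsumer) / (ic(c1) + ic(c2))."""
    denom = ic(c1) + ic(c2)
    return 2 * resnik(c1, c2) / denom if denom > 0 else 1.0


print(lin("dog", "cat"))  # the two concepts share the subsumer "animal"
print(lin("dog", "car"))  # the two concepts share only the root
```

In this sketch, two concepts whose only common subsumer is the root get similarity zero, because the root carries no information content.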

Change history

  • 09 March 2023

    In an older version of this paper, the online XML was correct, but in the PDF the characters > and < were not rendered on page 69. This has been corrected.

Notes

  1. \(ic(p) + ic(r_o \vert p) = ic(Pr(X=p)) + ic(Pr(Y=r_o \vert X=p)) = -\log(Pr(X=p)) - \log(Pr(Y=r_o \vert X=p)) = -\log(Pr(X=p)\,Pr(Y=r_o \vert X=p)) = -\log(Pr(X=p, Y=r_o)) = ic(Pr(X=p, Y=r_o)) = ic(Pr(p, r_o)) = ic(p, r_o)\).

  2. A direct link is an arc connecting two adjacent nodes, whereas an indirect link is a path with length greater than 1.

  3. In the original work [23], this function is defined as \(C_d(p_j,r_a,n)\), where \(n\) represents the value of the function \(C_d(p_j,r_a)\).

  4. Analogously to the function \(C_d\), in [23], these two functions are defined as \(C_{io}(p_j,r_a,n)\) and \(C_{ii}(p_j,r_a,n)\), respectively.
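
The identity in Note 1, namely that the information content of \(p\) plus the information content of \(r_o\) given \(p\) equals the information content of the joint event, can be checked numerically. The sketch below assumes \(ic(x) = -\log Pr(x)\); the two probability values and the variable names are hypothetical and serve only as an illustration.

```python
import math

# Hypothetical probabilities, chosen only for illustration.
pr_p = 0.25           # Pr(X = p)
pr_ro_given_p = 0.4   # Pr(Y = r_o | X = p)


def ic(probability):
    """Information content of an event with the given probability."""
    return -math.log(probability)


# Product rule: Pr(X = p, Y = r_o) = Pr(X = p) * Pr(Y = r_o | X = p).
pr_joint = pr_p * pr_ro_given_p

lhs = ic(pr_p) + ic(pr_ro_given_p)  # ic(p) + ic(r_o | p)
rhs = ic(pr_joint)                  # ic(p, r_o)

# The two sides agree up to floating-point rounding.
print(lhs, rhs, math.isclose(lhs, rhs))
```

The check relies only on the fact that the logarithm turns a product of probabilities into a sum, which is the step that links the third and fourth expressions in Note 1.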

References

  1. Abdelrahman, A.M.B., Kayed, A.: A survey on semantic similarity measures between concepts in health domain. Am. J. Comput. Math. 5, 204–214 (2015)

  2. Adhikari, A., Singh, S., Mondal, D., Dutta, B., Dutta, A.: A novel information theoretic framework for finding semantic similarity in WordNet. CoRR abs/1607.05422 (2016)

  3. Adhikari, A., Dutta, B., Dutta, A., Mondal, D., Singh, S.: An intrinsic information content-based semantic similarity measure considering the disjoint common subsumers of concepts of an ontology. J. Assoc. Inf. Sci. Technol. 69(8), 1023–1034 (2018)

  4. Majumder, G., Pakray, P., Gelbukh, A.: Measuring semantic textual similarity of sentences using modified information content and lexical taxonomy. Int. J. Comput. Linguist. Appl. 7(2), 65–85 (2016)

  5. Banu, A., Fatima, S.S., Khan, K.: Information content based semantic similarity measure for concepts subsumed by multiple concepts. Int. J. Web Appl. 7(3), 85–94 (2015)

  6. Batet, M., Sánchez, D.: Leveraging synonymy and polysemy to improve semantic similarity assessments based on intrinsic information content. Artif. Intell. Rev. 53(3), 2023–2041 (2020)

  7. Cazzanti, L., Gupta, M.R.: Information-theoretic and set-theoretic similarity, pp. 1836–1840. IEEE International Symposium on Information Theory, Seattle, WA (2006)

  8. Chandrasekaran, D., Mago, V.: Evolution of semantic similarity - a survey. ACM Comput. Surv. 54(2), Article 41 (2021)

  9. El Vaigh, C.B., Goasdoué, F., Gravier, G., Sébillot, P.: A novel path-based entity relatedness measure for efficient collective entity linking. In: Pan, J.Z., et al. (eds.) ISWC 2020. LNCS, vol. 12506, pp. 164–182. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-62419-4_10

  10. Formica, A., Pourabbas, E.: Content based similarity of geographic classes organized as partition hierarchies. Knowl. Inf. Syst. 20(2), 221–241 (2009)

  11. Formica, A., Missikoff, M., Pourabbas, E., Taglino, F.: Semantic search for matching user requests with profiled enterprises. Comput. Ind. 64(3), 191–202 (2013)

  12. Formica, A.: Similarity reasoning in formal concept analysis: from one- to many-valued contexts. Knowl. Inf. Syst. 60(2), 715–739 (2019)

  13. Formica, A., Mazzei, M., Pourabbas, E., Rafanelli, M.: Approximate query answering based on topological neighborhood and semantic similarity in OpenStreetMap. IEEE Access 8, 87011–87030 (2020)

  14. Formica, A., Taglino, F.: An enriched information-theoretic definition of semantic similarity in a taxonomy. IEEE Access 9, 100583–100593 (2021)

  15. Hadj Taieb, M.A., Zesch, T., Aouicha, M.B.: A survey of semantic relatedness evaluation datasets and procedures. Artif. Intell. Rev. 53, 4407–4448 (2020)

  16. Hulpuş, I., Prangnawarat, N., Hayes, C.: Path-based semantic relatedness on linked data and its use to word and entity disambiguation. In: Arenas, M., et al. (eds.) ISWC 2015. LNCS, vol. 9366, pp. 442–457. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-25007-6_26

  17. Jeong, S., Yim, J.H., Lee, H.J., Sohn, M.M.: Semantic similarity calculation method using information contents-based edge weighting. J. Internet Serv. Inf. Secur. 7(1), 40–53 (2017)

  18. Jiang, J.J., Conrath, D.W.: Semantic similarity based on corpus statistics and lexical taxonomy. In: Proceedings of the International Conference on Research in Computational Linguistics (ROCLING X), Taiwan (1997)

  19. Lastra-Díaz, J.J., García-Serrano, A.: A new family of information content models with an experimental survey on WordNet. Knowl.-Based Syst. 89, 509–526 (2015)

  20. Lin, D.: An information-theoretic definition of similarity. In: Proceedings of the International Conference on Machine Learning, Madison, Wisconsin, USA, pp. 296–304. Morgan Kaufmann (1998)

  21. Meymandpour, R., Davis, J.G.: A semantic similarity measure for linked data: an information content-based approach. Knowl.-Based Syst. 109, 276–293 (2016)

  22. Miller, G.A., Charles, W.G.: Contextual correlates of semantic similarity. Lang. Cognit. Process. 6(1), 1–28 (1991)

  23. Passant, A.: Measuring semantic distance on linking data and using it for resources recommendations. In: Proceedings of the AAAI Spring Symposium on Linked Data Meets Artificial Intelligence (2010)

  24. Pirrò, G.: A semantic similarity metric combining features and intrinsic information content. Data Knowl. Eng. 68(11), 1289–1308 (2009)

  25. Rada, R., Mili, H., Bicknell, E., Blettner, M.: Development and application of a metric on semantic nets. IEEE Trans. Syst. Man Cybern. 19(1), 17–30 (1989)

  26. Resnik, P.: Using information content to evaluate semantic similarity in a taxonomy. In: Proceedings of the Int. Joint Conference on Artificial Intelligence, Montreal, Quebec, Canada, August 20–25, pp. 448–453. Morgan Kaufmann (1995)

  27. Resnik, P.: Semantic similarity in a taxonomy: an information-based measure and its application to problems of ambiguity in natural language. J. Artif. Intell. Res. 11, 95–130 (1999)

  28. Schwering, A.: Approaches to semantic similarity measurement for geo-spatial data: a survey. Trans. GIS 12(1), 5–29 (2008)

  29. Schuhmacher, M., Ponzetto, S.P.: Knowledge-based graph document modeling. In: Proceedings of the 7th ACM International Conference on Web Search and Data Mining, (WSDM), New York, USA, pp. 543–552 (2014)

  30. Tversky, A.: Features of similarity. Psychol. Rev. 84, 327–352 (1977)

  31. Taglino, F., Formica, A.: Semantic similarity with concept senses. Mendeley Data, V1 (2022). https://data.mendeley.com/datasets/994p293zcf

  32. Wang, F., Wang, N., Cai, S., Zhang, W.: A similarity measure in formal concept analysis containing general semantic information and domain information. IEEE Access 8, 75303–75312 (2020)

  33. Weller-Fahy, D.J., Borghetti, B.J., Sodemann, A.A.: A survey of distance and similarity measures used within network intrusion anomaly detection. IEEE Commun. Surv. Tutor. 17(1), 70–91 (2015)

  34. Witten, I.H., Milne, D.: An effective, low-cost measure of semantic relatedness obtained from Wikipedia links. In: Proceedings of AAAI Workshop on Wikipedia and Artificial Intelligence: an Evolving Synergy, pp. 25–30. AAAI Press, Chicago, USA (2008)

  35. Wu, Z., Palmer, M.: Verb semantics and lexical selection. In: Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, New Mexico, pp. 133–138 (1994)

  36. Zhang, X., Sun, S., Zhang, K.: An information content-based approach for measuring concept semantic similarity in WordNet. Wireless Pers. Commun. 103(1), 117–132 (2018). https://doi.org/10.1007/s11277-018-5429-7

  37. Zhou, W., Wang, H., Chao, J., Zhang, W., Yu, Y.: LODDO: using linked open data description overlap to measure semantic relatedness between named entities. In: Pan, J.Z., et al. (eds.) Proceedings of Joint International Semantic Technology Conference, JIST 2011 (2012)

Author information

Corresponding author

Correspondence to Anna Formica.

Copyright information

© 2023 Springer-Verlag GmbH Germany, part of Springer Nature

About this chapter

Cite this chapter

Formica, A., Taglino, F. (2023). Semantic Similarity in a Taxonomy by Evaluating the Relatedness of Concept Senses with the Linked Data Semantic Distance. In: Hameurlain, A., Tjoa, A.M. (eds) Transactions on Large-Scale Data- and Knowledge-Centered Systems LIII. Lecture Notes in Computer Science, vol 13840. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-66863-4_3

  • DOI: https://doi.org/10.1007/978-3-662-66863-4_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-66862-7

  • Online ISBN: 978-3-662-66863-4

  • eBook Packages: Computer Science, Computer Science (R0)
