Abstract
The Web is a constantly growing repository of information. The amount of data that becomes available exceeds our ability to search and examine it in a reasonable time and with practical effort. The data is stored as documents, texts and web pages, which are not suitable for comprehensive analysis and search. To make the data stored on the Internet more accessible, a new model of data representation has been introduced: linked data. Linked data provides an open platform for representing and storing structured data as well as metadata. In this paper, we propose a novel approach for calculating the degree of similarity between two entities in the web of linked data. The idea is based on the fact that entities are embedded in the linked data and their semantics is defined via their connections to other entities. Therefore, the similarity between two entities is determined by comparing their connections to other entities. First, the approach is introduced to determine semantic similarity in a context-free manner: it does not select specific types of connections but takes all of them into consideration. Second, a context-aware approach is presented as a modification of the original method. In this case, a context is defined by a set of connection types, and only connections of those types are considered when determining similarity. The proposed approach uses concepts of possibility theory to determine the lower and upper bounds of similarity intervals. We evaluate the proposed similarity assessment process by applying it to real-world datasets, and we compare it to other related methods.
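The distinction between the context-free and context-aware modes can be illustrated with a minimal sketch. It is not the method proposed in the paper: the abstract does not give the similarity formula, so a plain Jaccard overlap of connections is swapped in as a stand-in, and the possibility-theoretic lower and upper bounds are not reproduced. The function name `similarity`, the `Connection` type, and the toy entities are all hypothetical.

```python
from typing import Iterable, Optional, Set, Tuple

# A connection is a (connection type, connected entity) pair.
# This representation is an assumption made for illustration only.
Connection = Tuple[str, str]


def similarity(
    a: Set[Connection],
    b: Set[Connection],
    context: Optional[Iterable[str]] = None,
) -> float:
    """Jaccard overlap of two entities' connections (stand-in measure).

    With context=None every connection is used (context-free).
    Otherwise only connections whose type is in `context` are compared
    (context-aware).
    """
    if context is not None:
        allowed = set(context)
        a = {c for c in a if c[0] in allowed}
        b = {c for c in b if c[0] in allowed}
    union = a | b
    if not union:
        # No connections left to compare; treat as no evidence of similarity.
        return 0.0
    return len(a & b) / len(union)


# Toy entities (hypothetical data, not from the paper's datasets).
book_a = {("author", "AuthorX"), ("genre", "Fantasy"), ("publisher", "PubY")}
book_b = {("author", "AuthorX"), ("genre", "SciFi"), ("publisher", "PubY")}

context_free = similarity(book_a, book_b)                     # 2 shared of 4 distinct
by_author = similarity(book_a, book_b, context={"author"})    # 1 shared of 1
by_genre = similarity(book_a, book_b, context={"genre"})      # 0 shared of 2
```

In this toy case the two books look moderately similar overall, identical in the `author` context, and entirely dissimilar in the `genre` context, which is the kind of context dependence the abstract describes.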







Cite this article
D. Hossein Zadeh, P., Reformat, M.Z. Context-aware similarity assessment within semantic space formed in linked data. J Ambient Intell Human Comput 4, 515–532 (2013). https://doi.org/10.1007/s12652-012-0154-7
DOI: https://doi.org/10.1007/s12652-012-0154-7