Abstract
Computing the information content (IC) of a concept is a core issue for IC-based semantic similarity measures. So far, little work has focused on calculating the IC of multiple-inheritance nodes. This paper therefore proposes a new IC computing model that calculates the IC of any node in WordNet, whether it is a single-inheritance node or a multiple-inheritance node. The model computes the IC of a concept from five parameters: hypernyms, hyponyms, relative depth, maximum nodes, and siblings. Experimental results on the “poison” snippet of the WordNet taxonomy show that the model handles both single inheritance and multiple inheritance effectively. The results also indicate that the model assigns distinguishable IC values to nodes that differ in any one of relative depth, hyponyms, hypernyms, or siblings. Based on the proposed model, a taxonomical semantic similarity measure is proposed for computing the semantic similarity of multiple-inheritance nodes. Finally, the proposed approach is compared with other IC-based similarity measures on the given fragment of the WordNet classification tree, and the results show that it performs well.
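To make the setting concrete, the sketch below computes an intrinsic IC value and an IC-based similarity on a small taxonomy that contains one multiple-inheritance node. It is not the model proposed in this paper, whose formula is not given in the abstract. As stand-ins it assumes the hyponym-count formulation usually attributed to Seco et al. (2004), IC(c) = 1 - log(hypo(c) + 1) / log(max_nodes), and Lin's ratio 2·IC(lcs) / (IC(c1) + IC(c2)). The taxonomy, the node labels, and all function names are hypothetical and do not come from WordNet.

```python
import math

# Hypothetical toy taxonomy: node -> list of direct hypernyms (not real WordNet data).
PARENTS = {
    "root": [],
    "a": ["root"],
    "b": ["root"],
    "m": ["a", "b"],  # multiple-inheritance node: two direct hypernyms
    "x": ["a"],       # single-inheritance node
    "y": ["m"],
}


def ancestors(node):
    """Return the node itself plus every hypernym reachable through any parent."""
    seen = {node}
    stack = [node]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def hyponym_count(node):
    """Count the distinct descendants of a node (each counted once, even in a DAG)."""
    return sum(1 for other in PARENTS if other != node and node in ancestors(other))


def intrinsic_ic(node):
    """Assumed stand-in formula: 1 - log(hypo + 1) / log(max_nodes)."""
    return 1.0 - math.log(hyponym_count(node) + 1) / math.log(len(PARENTS))


def lin_similarity(c1, c2):
    """Lin-style ratio using the most informative common subsumer."""
    if c1 == c2:
        return 1.0
    common = ancestors(c1) & ancestors(c2)
    ic_lcs = max(intrinsic_ic(c) for c in common)
    denom = intrinsic_ic(c1) + intrinsic_ic(c2)
    return 2.0 * ic_lcs / denom if denom > 0 else 0.0


if __name__ == "__main__":
    for name in PARENTS:
        print(name, round(intrinsic_ic(name), 4))
    print("sim(x, y) =", round(lin_similarity("x", "y"), 4))
```

Because this stand-in formula depends only on the hyponym count, the leaves "x" and "y" receive the same IC even though "y" sits below a multiple-inheritance node. That is the kind of case the additional parameters listed in the abstract (hypernyms, relative depth, siblings) are meant to separate.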


Acknowledgements
The authors acknowledge the National Natural Science Foundation of China (Grant No.: 61562072).
Cite this article
Zhang, X., Sun, S. & Zhang, K. An information Content-Based Approach for Measuring Concept Semantic Similarity in WordNet. Wireless Pers Commun 103, 117–132 (2018). https://doi.org/10.1007/s11277-018-5429-7
DOI: https://doi.org/10.1007/s11277-018-5429-7