A Large Probabilistic Semantic Network Based Approach to Compute Term Similarity


Abstract:

Measuring semantic similarity between two terms is essential for a variety of text analytics and understanding applications. Two main approaches currently address this task: knowledge-based and corpus-based. However, existing approaches are better suited to measuring similarity between single words than between the more general multi-word expressions (MWEs), and they do not scale well. In contrast to these techniques, we propose an efficient and effective approach to semantic similarity that uses a large-scale semantic network. This network is automatically acquired from billions of web documents and consists of millions of concepts, which explicitly model the context of semantic relationships. In this paper, we first show how to map two terms into the concept space and compare their similarity there. Then, we introduce a clustering approach that orthogonalizes the concept space in order to improve the accuracy of the similarity measure. Finally, we conduct extensive studies to demonstrate that our approach accurately computes semantic similarity between terms, including MWEs and ambiguous terms, and significantly outperforms 12 competing methods as measured by the Pearson correlation coefficient. Our approach is also much more efficient than all competing algorithms and can be used to compute semantic similarity at large scale.
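The first step described above, mapping two terms into a concept space and comparing them there, might be sketched roughly as follows. This is only an illustration under stated assumptions: the concept weights are invented toy values rather than data from the semantic network, the names `CONCEPTS`, `cosine`, and `term_similarity` are hypothetical, and cosine similarity is assumed as the comparison because the abstract does not name a specific measure. The clustering step is not shown.

```python
import math

# Hypothetical toy weights P(concept | term). These values are invented for
# illustration and are NOT taken from the paper's semantic network.
CONCEPTS = {
    "apple": {"fruit": 0.5, "company": 0.4, "food": 0.1},
    "microsoft": {"company": 0.9, "software vendor": 0.1},
    "pear": {"fruit": 0.8, "food": 0.2},
}


def cosine(u, v):
    """Cosine similarity between two sparse concept vectors stored as dicts."""
    dot = sum(weight * v.get(concept, 0.0) for concept, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an unknown term has no concepts, so similarity is 0
    return dot / (norm_u * norm_v)


def term_similarity(term_a, term_b, concepts=CONCEPTS):
    """Map each term to its concept vector, then compare in concept space."""
    return cosine(concepts.get(term_a, {}), concepts.get(term_b, {}))
```

With these toy weights, the ambiguous term "apple" shares concepts with both "pear" and "microsoft", while "microsoft" and "pear" share none, so their similarity is zero.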
Published in: IEEE Transactions on Knowledge and Data Engineering ( Volume: 27, Issue: 10, 01 October 2015)
Page(s): 2604 - 2617
Date of Publication: 03 April 2015