Unsupervised similarity-based word sense disambiguation using context vectors and sentential word importance

Published: 16 May 2012


The process of identifying the actual meanings of words in a given text fragment has a long history in the field of computational linguistics. Due to its importance in understanding the semantics of natural language, it is considered one of the most challenging problems facing this field. In this article we propose a new unsupervised similarity-based word sense disambiguation (WSD) algorithm that operates by computing the semantic similarity between glosses of the target word and a context vector. The sense of the target word is determined as that for which the similarity between gloss and context vector is greatest. Thus, whereas conventional unsupervised WSD methods are based on measuring pairwise similarity between words, our approach is based on measuring semantic similarity between sentences. This enables it to utilize a higher degree of semantic information, and is more consistent with the way that human beings disambiguate; that is, by considering the greater context in which the word appears. We also show how performance can be further improved by incorporating a preliminary step in which the relative importance of words within the original text fragment is estimated, thereby providing an ordering that can be used to determine the sequence in which words should be disambiguated. We provide empirical results that show that our method performs favorably against the state-of-the-art unsupervised word sense disambiguation methods, as evaluated on several benchmark datasets through different models of evaluation.
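The sense-selection rule described in the abstract (pick the sense whose gloss is most similar to the context) can be illustrated with a minimal sketch. This is not the authors' implementation: plain bag-of-words cosine similarity stands in for their sentence-level semantic similarity measure, and the glosses, sense labels, stop-word list, and function names are all hypothetical.

```python
import math
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not taken from the paper).
STOPWORDS = {"a", "an", "the", "of", "and", "on", "that", "such", "as"}

def bag_of_words(text):
    """Term counts over lower-cased, whitespace-split tokens, minus stop words."""
    return Counter(t for t in text.lower().split() if t not in STOPWORDS)

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def disambiguate(target, sentence, glosses):
    """Return the sense whose gloss is most similar to the context, plus all scores.

    The context vector is built from every word in the sentence except the target.
    """
    context_words = [w for w in sentence.lower().split() if w != target]
    context = bag_of_words(" ".join(context_words))
    scores = {sense: cosine(bag_of_words(gloss), context)
              for sense, gloss in glosses.items()}
    return max(scores, key=scores.get), scores

# Hypothetical glosses for two senses of "bank".
GLOSSES = {
    "bank#finance": "an institution that accepts deposits of money and makes loans",
    "bank#river": "sloping land beside a body of water such as a river",
}

sense, scores = disambiguate(
    "bank", "she sat on the bank of the river and watched the water", GLOSSES
)
print(sense, scores)
```

Word-overlap cosine only rewards exact matches between gloss and context; a semantic similarity measure of the kind the abstract describes would also credit related words that do not match literally.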


One of the major challenges for computers remains the ability to deal with text formulated in a natural language. Significant advances in computational linguistics, artificial intelligence, and computing power have given us tools such as search engines that are practically indispensable for many tasks, but in other applications, such as translation and virtual assistants, performance still ranges from amazing to appalling. One of the core underlying issues is that humans can quickly determine the meaning of a natural language statement, usually without much effort. This interpretation of natural language statements relies on a combination of analyzing the structure of sentences and determining the intended meaning of words and phrases. For computers, the second aspect is far more challenging, both conceptually and computationally. In this paper, the authors describe an approach that addresses the word sense disambiguation (WSD) problem by measuring the semantic similarity between the possible interpretations of a particular word and its context, represented by the remaining words within the text fragment under consideration. This approach is also referred to as knowledge-based WSD, in contrast to corpus-based methods, which rely on the availability of training data consisting of words labeled with the correct meaning. Knowledge-based WSD has been pursued by other researchers with various similarity measures and computational refinements, but it is seriously affected by the number of calculations needed to compare the different interpretations of all the words under consideration. Naive approaches typically lead to combinatorial increases in time or space requirements, whereas refinements often require restrictions such as considering only shorter fragments of text.
The new method's computational complexity is quadratic in the number of words in the context, whereas other approaches are often exponential in the size of the context window (also measured in words). Further improvements are achieved by establishing the order in which the words of the text fragment are considered for disambiguation. This relies on a graph-based approach to weigh the "importance" of words within a text fragment, similar to Google's PageRank algorithm for ordering Web pages in search results. The paper is well written, nicely structured, and reasonably easy to follow. It offers a good overview of the current state of WSD, with brief descriptions of commonly used methods. The results obtained by the authors are better than those of the methods they compare against, both in standalone experiments based on commonly used datasets and in more challenging tasks involving sentence similarity measurement and sentence clustering. Especially for the sentence similarity task, the authors report significantly better results than previous approaches, also surpassing the mean performance of human participants in the experiment.

Online Computing Reviews Service
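To make the graph-based word-importance idea mentioned in the review more concrete, here is a rough PageRank-style sketch. The review does not specify how the graph is built, so everything here is an assumption for illustration: edges link words that co-occur within a small window, the damping factor is the conventional 0.85, and the most important words are placed first in the ordering. The function names are hypothetical.

```python
def word_importance(words, window=2, damping=0.85, iterations=50):
    """PageRank-style scores on an undirected co-occurrence graph.

    Assumption: two distinct words are linked if they appear within
    `window` positions of each other in the fragment.
    """
    neighbors = {w: set() for w in words}
    for i, w in enumerate(words):
        for j in range(i + 1, min(i + window + 1, len(words))):
            if words[j] != w:
                neighbors[w].add(words[j])
                neighbors[words[j]].add(w)

    n = len(neighbors)
    rank = {w: 1.0 / n for w in neighbors}
    for _ in range(iterations):
        new_rank = {}
        for w in neighbors:
            # Each neighbor u shares its current rank equally among its own neighbors.
            incoming = sum(rank[u] / len(neighbors[u]) for u in neighbors[w])
            new_rank[w] = (1 - damping) / n + damping * incoming
        rank = new_rank
    return rank

def disambiguation_order(words, **kwargs):
    """Distinct words sorted from highest to lowest importance score (assumed order)."""
    rank = word_importance(words, **kwargs)
    return sorted(rank, key=rank.get, reverse=True)

fragment = "she sat on the bank of the river and watched the water".split()
print(disambiguation_order(fragment))
```

Under this sketch, words that sit close to many other words in the fragment accumulate the highest scores and would be disambiguated first.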

