Abstract:
Measuring word similarity is a core problem in natural language processing because it has many applications. Although many studies have been reported and many techniques developed to address this problem for English, no study has yet reported on the application, analysis, and evaluation of word similarity techniques for Vietnamese. In particular, there is still no benchmark Vietnamese dataset for evaluating these techniques. In this paper, we report on three main topics: first, we construct a benchmark dataset for evaluating similarity techniques on the Vietnamese language; second, we experiment with several similarity techniques based on WordNet and word embeddings; and finally, we propose an extension of the Lesk algorithm to improve the efficiency of similarity measurement for the Vietnamese language.
Date of Conference: 19-21 October 2017
Date Added to IEEE Xplore: 23 November 2017