Abstract:
Current methods of information extraction, and of text mining in general, represent texts in ways that do not always reflect the dependencies between descriptors. In the vector representation, for example, related descriptors are often treated as either totally independent or totally similar. Such an approach can be regarded as a coarse-resolution view of the document. In this paper we propose a new information retrieval method based on a signal representation of documents and spectral processing at different levels of resolution. It is a new way to exploit the power and properties of the multiresolution analysis provided by the wavelet transform. To illustrate the potential of this approach, we applied it to an Arabic corpus. In this setting, our approach shows an ability to achieve higher accuracy than the standard vector representation.
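As an illustration only, the idea of viewing a document at several resolutions can be sketched as follows. The abstract does not specify how the signal is built or which wavelet is used, so everything here is an assumption: the hypothetical helper `term_signal` counts a term's occurrences in equal-width segments of the document, and `haar_decompose` applies an unnormalized Haar transform (pairwise sums and differences). The number of bins is assumed to be a power of two.

```python
from typing import List, Tuple


def term_signal(tokens: List[str], term: str, bins: int = 8) -> List[float]:
    """Hypothetical signal: occurrences of `term` in `bins` equal-width segments."""
    signal = [0.0] * bins
    n = len(tokens)
    for i, tok in enumerate(tokens):
        if tok == term:
            # Map token position i to its segment index.
            signal[i * bins // n] += 1.0
    return signal


def haar_decompose(signal: List[float]) -> Tuple[List[List[float]], List[List[float]]]:
    """Unnormalized Haar analysis (assumed wavelet, not taken from the paper).

    Returns (approximations, details). approximations[0] is the input signal;
    each later level halves the resolution by summing neighbouring pairs.
    details[k] holds the pairwise differences lost when moving from
    approximations[k] to approximations[k + 1].
    """
    approximations = [list(signal)]
    details: List[List[float]] = []
    current = list(signal)
    while len(current) > 1:
        sums = [current[i] + current[i + 1] for i in range(0, len(current), 2)]
        diffs = [current[i] - current[i + 1] for i in range(0, len(current), 2)]
        approximations.append(sums)
        details.append(diffs)
        current = sums
    return approximations, details


# Toy document of eight tokens; the term "a" occurs at positions 0, 2 and 7.
tokens = "a b a c d e f a".split()
signal = term_signal(tokens, "a", bins=8)
approximations, details = haar_decompose(signal)
```

Under these assumptions, the coarsest approximation is a single number equal to the raw term count, which is all that a plain term-frequency vector retains. The finer levels keep where in the document the term occurs, which is the positional information a multiresolution representation can exploit.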
Date of Conference: 27-30 May 2013
Date Added to IEEE Xplore: 03 October 2013
Electronic ISBN: 978-1-4799-0792-2