K-means based method for overlapping document clustering

In this paper, we propose a k-means based method for overlapping document clustering that allows the user to specify the number of groups to be built. Our experiments with different corpora show that our proposal obtains better results in terms of FBcubed than other recent methods for overlapping document clustering reported in the literature.
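To make the evaluation measure concrete, the following is a minimal sketch of how an FBcubed score can be computed for overlapping clusterings. It assumes that FBcubed here means the F-measure over extended BCubed precision and recall, the usual definition when documents may belong to several groups. The function name `extended_bcubed` and the dictionary-based input layout are illustrative assumptions, not part of the proposed method.

```python
from typing import Dict, Hashable, Set, Tuple


def extended_bcubed(
    clusters: Dict[Hashable, Set[Hashable]],
    classes: Dict[Hashable, Set[Hashable]],
) -> Tuple[float, float, float]:
    """Return (precision, recall, F) under extended BCubed.

    clusters[d] is the set of cluster ids assigned to document d;
    classes[d] is the set of gold categories of document d.
    Both dictionaries are assumed to share the same keys.
    """
    docs = list(clusters)
    prec_sum = 0.0
    rec_sum = 0.0
    for e in docs:
        p_vals, r_vals = [], []
        for o in docs:
            shared_clusters = len(clusters[e] & clusters[o])
            shared_classes = len(classes[e] & classes[o])
            both = min(shared_clusters, shared_classes)
            # Precision averages over documents sharing a cluster with e.
            if shared_clusters > 0:
                p_vals.append(both / shared_clusters)
            # Recall averages over documents sharing a category with e.
            if shared_classes > 0:
                r_vals.append(both / shared_classes)
        # A document with no cluster (or no category) contributes 0.
        prec_sum += sum(p_vals) / len(p_vals) if p_vals else 0.0
        rec_sum += sum(r_vals) / len(r_vals) if r_vals else 0.0
    precision = prec_sum / len(docs) if docs else 0.0
    recall = rec_sum / len(docs) if docs else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f_score


# Toy data (hypothetical): document "b" belongs to two gold categories.
gold = {"a": {"X"}, "b": {"X", "Y"}, "c": {"Y"}}
predicted = {"a": {0}, "b": {0, 1}, "c": {1}}
p, r, f = extended_bcubed(predicted, gold)
```

In this toy example the predicted overlap mirrors the gold categories exactly, so precision, recall, and F are all 1. Placing every document in a single cluster would lower both precision and recall, because pairs of documents would share clusters without sharing the matching number of categories.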

Text collections call for overlapping clustering algorithms, in which a document may belong to more than one group, and this work improves on k-means for that purpose. The algorithm achieves good results on different corpora. Grouping documents across different corpora is useful for multiple applications; it is needed today in recommender systems and in information search and retrieval.

Beatriz Beltrán


