Kernel spectral document clustering using unsupervised precision-recall metrics | IEEE Conference Publication | IEEE Xplore

Kernel spectral document clustering using unsupervised precision-recall metrics


Abstract:

Kernel Spectral Clustering (KSC) solves a weighted kernel principal component analysis problem in a primal-dual optimization framework. The KSC model is built on a small subset of the data through training, model selection, and test phases. The clustering model is obtained from the dual solution of the problem and has a powerful out-of-sample extension property, which allows cluster membership to be assigned to previously unseen data points. In the model selection phase, we estimate the appropriate number of clusters using a metric that evaluates the quality of the clusters. Traditional quality indices such as inertia, the Davies-Bouldin (DB) index, and the silhouette (SIL) index are known to be method-dependent and to perform poorly on complex heterogeneous data such as text. In this paper, we use quality evaluation techniques based on an unsupervised version of Precision, Recall, and F-measure proposed in [1] to build a new kernel spectral document clustering (KSDC) model that generates homogeneous clusters of documents. We compare the quality of the clusters obtained by the proposed KSDC technique with those of the k-means and neural gas algorithms, which are more oriented towards these metrics, on several real-world text datasets.
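
The model selection step described in the abstract, scoring candidate clusterings with a quality index and keeping the best-scoring number of clusters, can be illustrated with a minimal sketch. The abstract does not define the unsupervised Precision, Recall, and F-measure of [1], so this sketch substitutes the silhouette index, one of the traditional indices the abstract mentions. It is not the KSDC method: the toy term-count vectors, the hand-written candidate partitions, and the names `silhouette` and `select_partition` are assumptions made for illustration only.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette(points, labels):
    """Mean silhouette width of a partition (stand-in quality index).

    Points in singleton clusters contribute 0, the usual convention.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # s(i) = 0 for a singleton cluster
        # a: mean distance to the other members of the point's own cluster
        a = sum(euclidean(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def select_partition(points, candidates):
    """Score each candidate partition and return (best_k, scores).

    `candidates` maps a number of clusters k to a list of labels.
    """
    scores = {k: silhouette(points, labels) for k, labels in candidates.items()}
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Hypothetical term-count vectors for six short documents (two obvious topics).
docs = [(5, 0, 0), (4, 1, 0), (5, 1, 0), (0, 0, 5), (0, 1, 4), (1, 0, 5)]

# Hypothetical candidate partitions, as a clustering method might return them.
candidates = {
    2: [0, 0, 0, 1, 1, 1],
    3: [0, 0, 2, 1, 1, 1],  # splits the first topic unnecessarily
}

best_k, scores = select_partition(docs, candidates)
print(best_k, scores)
```

In a real pipeline the candidate partitions would come from the clustering model itself, trained once per candidate number of clusters, and the stand-in index would be replaced by the metric chosen for model selection.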
Date of Conference: 12-17 July 2015
Date Added to IEEE Xplore: 01 October 2015
Conference Location: Killarney, Ireland
