A clustering framework based on subjective and objective validity criteria

Published: 02 February 2008


Clustering, as an unsupervised learning process, is a challenging problem, especially for high-dimensional datasets. The quality of a clustering result can benefit from user constraints and from objective validity assessment. In this article, we propose a semisupervised framework for learning the weighted Euclidean subspace in which the best clustering can be achieved. Our approach capitalizes on: (i) user constraints; and (ii) the quality of intermediate clustering results in terms of their structural properties. The proposed framework takes the clustering algorithm and the validity measure as its parameters. We develop and discuss algorithms for learning and tuning the weights of the contributing dimensions and for defining the “best” clustering obtained by satisfying user constraints. Experimental results on benchmark datasets demonstrate the superiority of the proposed approach in terms of improved clustering accuracy.
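
The abstract does not spell out the algorithms, so the following is only a minimal sketch of the general idea, under stated assumptions: a weight vector over the dimensions defines a weighted Euclidean distance, a pluggable clustering algorithm runs under that distance, and candidate weight vectors are ranked first by the fraction of user constraints satisfied and then by a validity measure. The k-means stand-in, the must-link/cannot-link form of the constraints, the between/within validity ratio, the candidate grid, and all function names are illustrative assumptions, not the authors' method.

```python
import math


def weighted_distance(x, y, w):
    """Weighted Euclidean distance: sqrt(sum_d w[d] * (x[d] - y[d])**2)."""
    return math.sqrt(sum(wd * (a - b) ** 2 for wd, a, b in zip(w, x, y)))


def kmeans(points, k, w, iters=20):
    """Stand-in clusterer (assumption): Lloyd-style k-means under the weighted distance."""
    # Deterministic farthest-first initialisation keeps the example reproducible.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(weighted_distance(p, c, w) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: weighted_distance(p, centers[c], w)) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def constraint_score(labels, must_link, cannot_link):
    """Fraction of pairwise user constraints that the clustering satisfies."""
    ok = sum(labels[i] == labels[j] for i, j in must_link)
    ok += sum(labels[i] != labels[j] for i, j in cannot_link)
    total = len(must_link) + len(cannot_link)
    return ok / total if total else 1.0


def validity(points, labels, w):
    """Toy validity measure (assumption): mean between-cluster distance
    divided by mean within-cluster distance; higher is better."""
    within, between = [], []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = weighted_distance(points[i], points[j], w)
            (within if labels[i] == labels[j] else between).append(d)
    if not within or not between:
        return 0.0
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    return mean_between / mean_within if mean_within > 0 else float("inf")


def learn_weights(points, k, candidates, must_link, cannot_link):
    """Pick the candidate weight vector whose clustering ranks best on
    (constraints satisfied, validity), compared in that order."""
    best = None
    for w in candidates:
        labels = kmeans(points, k, w)
        key = (constraint_score(labels, must_link, cannot_link), validity(points, labels, w))
        if best is None or key > best[0]:
            best = (key, w, labels)
    return best[1], best[2], best[0][0]


# Toy data: dimension 0 separates the two groups, dimension 1 is noise.
points = [(0.0, 0.0), (0.2, 5.0), (0.1, 10.0), (5.0, 0.5), (5.1, 5.5), (4.9, 9.5)]
candidates = [(1.0, 0.0), (0.75, 0.25), (0.5, 0.5), (0.25, 0.75), (0.0, 1.0)]
best_w, labels, satisfied = learn_weights(
    points, 2, candidates, must_link=[(0, 2)], cannot_link=[(0, 3)]
)
print(best_w, labels, satisfied)
```

On this toy data, only the weight vector that ignores the noisy second dimension yields a clustering that satisfies both constraints, so the search selects it. A real implementation would tune the weights rather than scan a fixed grid, and would swap in whatever clustering algorithm and validity measure the user supplies as parameters.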


Published In

ACM Transactions on Knowledge Discovery from Data, Volume 1, Issue 4
January 2008
143 pages
Association for Computing Machinery, New York, NY, United States

Publication History

Published: 02 February 2008
Accepted: 01 August 2007
Revised: 01 March 2007
Received: 01 August 2006


Author Tags

  1. semisupervised learning
  2. cluster validity
  3. data mining
  4. similarity measure learning
  5. space learning


  • Research-article
  • Research
  • Refereed

