Abstract:
Clustering is a widely used machine learning technique for unlabelled data. One recently proposed technique is the twin support vector clustering (TWSVC) algorithm, which generates a hyperplane for each cluster. TWSVC uses the hinge loss function to penalize misclassification. However, the hinge loss relies on the shortest distance between different clusters, which makes it unstable on noise-corrupted datasets and under re-sampling. In this paper, we propose a novel Sparse Pinball loss Twin Support Vector Clustering (SPTSVC) algorithm. The proposed SPTSVC uses the ε-insensitive pinball loss function to obtain a sparse solution. The pinball loss function provides noise insensitivity and re-sampling stability, while the ε-insensitive zone makes the model sparse and reduces testing time. Numerical experiments on synthetic and real-world benchmark datasets are performed to show the efficacy of the proposed model. An analysis of the sparsity of various clustering algorithms is also presented. To show the feasibility and applicability of the proposed SPTSVC on biomedical data, experiments have been performed on epilepsy and breast cancer datasets.
Published in: IEEE Journal of Biomedical and Health Informatics ( Volume: 25, Issue: 10, October 2021)
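
To make the contrast between the loss functions named in the abstract concrete, the following is a minimal sketch in Python. It is not taken from the paper: the abstract does not state the formulas, so the definitions below follow one common form used in the pinball-loss SVM literature and should be read as an assumption. The function names, the argument `u` (a generic residual such as 1 minus the margin), and the parameters `tau` and `eps` are hypothetical and chosen only for illustration. The actual SPTSVC constraints, which involve distances to cluster hyperplanes, are not reproduced here.

```python
def hinge_loss(u: float) -> float:
    """Hinge loss on a residual u: penalizes only u > 0."""
    return max(0.0, u)


def pinball_loss(u: float, tau: float) -> float:
    """Pinball loss: slope 1 for u >= 0 and slope tau for u < 0.

    Unlike the hinge loss, negative residuals are also penalized,
    which is the usual argument for its noise insensitivity.
    """
    if tau < 0:
        raise ValueError("tau must be non-negative")
    return u if u >= 0 else -tau * u


def eps_insensitive_pinball_loss(u: float, tau: float, eps: float) -> float:
    """Pinball loss with a flat zone (assumed form, see lead-in).

    The loss is zero for -eps/tau <= u <= eps. Points falling in this
    zone contribute nothing, which is the source of sparsity.
    """
    if tau <= 0 or eps < 0:
        raise ValueError("tau must be positive and eps non-negative")
    if u > eps:
        return u - eps
    if u < -eps / tau:
        return -tau * u - eps  # equals -tau * (u + eps / tau)
    return 0.0


if __name__ == "__main__":
    # Compare the three losses on a few residuals (tau and eps are arbitrary).
    for u in (-1.0, -0.1, 0.05, 1.0):
        print(u, hinge_loss(u), pinball_loss(u, 0.5),
              eps_insensitive_pinball_loss(u, 0.5, 0.1))
```

In this sketch, setting `eps` to zero recovers the plain pinball loss, and small residuals inside the flat zone incur no penalty, which mirrors the sparsity argument made in the abstract.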