Authors:
Konstantin Bogdanoski, Kostadin Mishev and Dimitar Trajanov
Affiliation:
Faculty of Computer Science and Engineering, Ss. Cyril and Methodius University, Rugjer Boshkovikj 16, Skopje, North Macedonia
Keyword(s):
Unsupervised Learning, Clustering, Hierarchical Clustering, Data Visualization, Machine Learning, Algorithm Optimisation, Machine Learning Tools, Blanket Clusterer, Silhouette Score.
Abstract:
We propose a generic hierarchical clustering algorithm, named Blanket Clusterer, which allows researchers to examine their data and verify the results obtained from other machine learning techniques. We also integrate a three-dimensional visualization plugin that provides a better understanding of the clustering results. We verify the tool on a specific use case: measuring the performance of clustering techniques on a textual dataset based solely on ICD-9 descriptions encoded using Word2Vec distributed representations. The verification shows that Blanket Clusterer provides an efficient pipeline for evaluating and interpreting the most frequently used clustering methods in unsupervised learning.
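The keywords list the Silhouette Score as the measure used to compare clusterings. As a rough illustration only, the sketch below computes the mean silhouette coefficient for a labelled set of vectors. It is not the Blanket Clusterer implementation: the function names, the Euclidean distance, and the toy 2-D vectors (standing in for Word2Vec embeddings of ICD-9 descriptions) are all assumptions made for this example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (hypothetical helper).

    For each point: a = mean distance to the other members of its own
    cluster, b = smallest mean distance to the members of any other
    cluster, and s = (b - a) / max(a, b). Singleton clusters score 0.
    """
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)
            continue
        a = sum(euclidean(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        # Guard against identical points, where both distances are zero.
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy 2-D vectors (assumed data): two tight groups far apart.
vectors = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(vectors, [0, 0, 1, 1])  # labels follow the groups
bad = silhouette_score(vectors, [0, 1, 0, 1])   # labels cut across the groups
print(f"good labelling: {good:.3f}, bad labelling: {bad:.3f}")
```

A score near 1 indicates compact, well-separated clusters, while a negative score suggests that points sit closer to another cluster than to their own, which is why the measure can be used to compare clusterings produced by different methods on the same embeddings.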