A novel scale-invariant, dynamic method for hierarchical clustering of data affected by measurement uncertainty

https://doi.org/10.1016/j.cam.2018.05.062
Under an Elsevier user license
open archive

Abstract

An enhanced technique for hierarchical agglomerative clustering is presented. Classical clustering methods suffer from non-uniqueness: the result depends on the scaling adopted for the data and on the arbitrary choice of the function used to measure the proximity between elements. Moreover, most classical methods cannot account for the effect of measurement uncertainty in the initial data, when present.

To overcome these limitations, a weighted, asymmetric function is introduced to quantify the proximity between any two elements. The data weighting depends dynamically on how far the clustering procedure has advanced. The novel proximity measure is derived from a geometric approach to clustering; it both makes the result independent of the data scaling and indicates the robustness of a clustering against the measurement uncertainty of the initial data.

The method applies to both flat and hierarchical clustering, maintaining the computational cost of the classical methods.
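
The scale dependence of classical methods mentioned above can be illustrated with a minimal sketch. It does not implement the weighted, asymmetric proximity measure proposed in the paper, which the abstract does not define; it only shows plain single-linkage clustering with Euclidean distance giving a different partition once one coordinate is expressed in different units. The function names `single_linkage` and `rescale`, the three sample points, and the scaling factor are all illustrative assumptions.

```python
from itertools import combinations
from math import dist


def single_linkage(points, k):
    """Classical single-linkage agglomerative clustering, stopped at k clusters.

    Illustrative baseline only; this is NOT the method proposed in the paper.
    Returns a set of frozensets of point indices.
    """
    clusters = [frozenset([i]) for i in range(len(points))]
    while len(clusters) > k:
        # Merge the two clusters whose closest members are nearest (Euclidean).
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: min(
                dist(points[i], points[j]) for i in pair[0] for j in pair[1]
            ),
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return set(clusters)


def rescale(points, factors):
    """Multiply each coordinate by its factor (e.g. a change of units)."""
    return [tuple(x * f for x, f in zip(p, factors)) for p in points]


# Hypothetical data: three points measured on two axes.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 3.0)]

original = single_linkage(points, 2)
# Same data with the first axis expressed in units ten times smaller.
rescaled = single_linkage(rescale(points, (10.0, 1.0)), 2)

print(original == rescaled)  # False: the partition depends on the scaling
```

In the original units, points 0 and 1 are merged first (distance 1 versus 3); after rescaling the first axis, points 0 and 2 are merged instead (distance 3 versus 10). This is the kind of non-uniqueness that a scale-invariant proximity measure is meant to remove.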

MSC

62H30
68T99

Keywords

Hierarchical clustering
Non-uniqueness
Proximity measure
Computational cost
Uncertainty
