Pattern Recognition

Volume 128, August 2022, 108670

Preserving similarity order for unsupervised clustering

https://doi.org/10.1016/j.patcog.2022.108670

Highlights

  • Our method takes the ordering of pairwise distances as the supervisory signal to learn the similarity score function.

  • Our similarity score function captures both local structure and global structure of the data sample distribution.

  • We propose a simple but effective strategy to identify the boundary samples from a given dataset.

Abstract

Unsupervised clustering categorizes a sample set into several groups, where the samples in the same group share high-level concepts. As clustering performance is heavily determined by the metric used to assess the similarity between sample pairs, we propose to learn a deep similarity score function and use it to capture the correlations between sample pairs for improved clustering. We formulate the learning procedure in a ranking framework and introduce two new supervisory signals to train our model. Specifically, we train the similarity score function to guarantee that 1) a sample has a higher similarity with its nearest neighbors than with other samples, in order to achieve correct clustering, and 2) the ordering of the similarities between neighboring sample pairs is preserved, in order to achieve robust clustering. To this end, we study not only the relevance between neighboring sample pairs for local structure learning, but also the relevance between each sample and the boundary samples for global structure learning. Extensive experiments on seven publicly available datasets, covering face image clustering, object image clustering, and real-world image clustering, validate the effectiveness of our proposed framework.

Introduction

Unsupervised machine learning techniques mine knowledge from datasets where no sample labels are available. These techniques are widely applied in a range of applications, such as visual similarity analysis [1], domain adaptation [2], surveillance [3], and image retrieval [4]. Clustering is one of the most popular unsupervised learning tasks in artificial intelligence and computer vision [5], [6]. It categorizes a collection of unlabeled data samples into several groups, where the samples in the same group are expected to share high-level concepts. Clustering techniques are not only widely used in data analytics, but also applicable in automatic data labeling and data visualization.

Due to the curse of dimensionality, some popular clustering techniques, including K-means and the Gaussian mixture model, are not directly applicable to high-dimensional data samples. To solve this problem, many unsupervised dimensionality reduction or low-dimensional representation learning techniques have been proposed to reveal the intrinsic structure of a given dataset, such as multi-dimensional scaling (MDS) [7], locally linear embedding (LLE) [8], and spectral embedding (SE) [9]. These subspace learning techniques can improve clustering performance by learning a more clustering-friendly representation space; in other words, they simultaneously enhance the intra-cluster similarity and the inter-cluster dissimilarity in the representation space.
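
A minimal sketch of this effect, assuming scikit-learn and its small digits dataset (an illustration of the general subspace-learning recipe, not the model proposed in this paper): K-means is run once on the raw features and once on a low-dimensional spectral embedding, and the two partitions are compared by normalized mutual information (NMI).

    # Illustrative only: subspace learning (spectral embedding) followed by K-means.
    from sklearn.datasets import load_digits
    from sklearn.manifold import SpectralEmbedding
    from sklearn.cluster import KMeans
    from sklearn.metrics import normalized_mutual_info_score

    X, y = load_digits(return_X_y=True)  # 1797 samples, 64 raw pixel features

    # K-means directly on the high-dimensional raw features
    raw_labels = KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(X)

    # K-means after mapping the samples into a 10-dimensional spectral embedding
    Z = SpectralEmbedding(n_components=10, n_neighbors=10,
                          random_state=0).fit_transform(X)
    emb_labels = KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(Z)

    print("NMI on raw features:      ", normalized_mutual_info_score(y, raw_labels))
    print("NMI on spectral embedding:", normalized_mutual_info_score(y, emb_labels))

Comparing the two NMI values illustrates the clustering-friendly-representation effect described above.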

In order to learn proper low-dimensional sample representations, many unsupervised dataset analysis techniques define their objective functions based on the pairwise difference [7], [8], [9], i.e., $y_i - y_j$, where $y_i$ and $y_j$ are the embeddings of two data samples (Section 2.1 provides more details). Thus, we can consider the pairwise difference as the implicit supervisory signal in representation learning. The importance of the pairwise difference, or pairwise distance, in unsupervised data analysis lies in its capability to reveal the relationships among the data samples. In a clustering task, however, the ordering of the pairwise distances directly influences the final clustering results and is, in most cases, more important than the distance values themselves. Yet, to the best of our knowledge, existing research has not taken the ordering of pairwise distances into consideration for unsupervised clustering. To this end, we propose to learn a score function in a ranking framework based on the ordering of pairwise distances. The learned similarity score function is tailored to the dataset and able to reveal the intrinsic dataset structure.
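
As a concrete illustration of this supervisory signal, the NumPy sketch below converts the ordering of pairwise distances into ranking constraints: for an anchor $x_i$ and two other samples $x_j$ and $x_k$ with $d(x_i, x_j) < d(x_i, x_k)$, the score function should satisfy $s(x_i, x_j) > s(x_i, x_k)$. The random pair sampling is our own simplification, not necessarily the construction used in the paper.

    # Illustrative: derive ordering-based ranking triplets from pairwise distances.
    import numpy as np

    def ordering_triplets(X, num_pairs_per_anchor=5, seed=0):
        """Return triplets (i, j, k) with d(x_i, x_j) < d(x_i, x_k)."""
        n = X.shape[0]
        D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # pairwise distances
        rng = np.random.default_rng(seed)
        triplets = []
        for i in range(n):
            others = np.delete(np.arange(n), i)
            for _ in range(num_pairs_per_anchor):
                j, k = rng.choice(others, size=2, replace=False)
                if D[i, j] == D[i, k]:
                    continue  # a tie carries no ordering information
                if D[i, j] > D[i, k]:
                    j, k = k, j  # ensure j is the closer sample
                triplets.append((int(i), int(j), int(k)))
        return triplets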

The existing methods [7], [8], [9] are also limited to low-level features and cannot discover the deep correlations between samples. Driven by the tremendous success of deep neural networks in various computer vision tasks [10], [11], researchers have proposed to boost clustering performance with deep representations [12], [13]. In comparison with supervised deep learning, the training procedures of these unsupervised deep learning methods [14], [15] are more difficult due to the unavailability of sample labels. To alleviate this difficulty, a number of supervisory signals have been proposed, such as soft cluster labels [14] and the K-means-friendliness of the representations [15]. Researchers have also guided the training of convolutional neural networks with K-means [16] and constrained dominant sets [17]. Different from the existing deep clustering methods, the proposed representation learning method is formulated in a ranking framework and adopts the ordering of pairwise distances as the supervisory signal. In summary, our method maps the data samples into an embedding space where 1) the ordering of the similarities between neighboring sample pairs is preserved and 2) the similarities between neighboring sample pairs are larger than those between distant sample pairs.
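
Continuing the sketch above, a pair-scoring network can then be trained on these ordering constraints with a margin ranking loss. This is a hedged PyTorch fragment: the two-layer scorer and the margin value are illustrative choices, not the architecture and loss of the paper.

    # Illustrative: learn a similarity score function from ordering constraints.
    import torch
    import torch.nn as nn

    class PairScorer(nn.Module):
        """Scores a sample pair from the concatenation of the two embeddings."""
        def __init__(self, dim, hidden=64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(2 * dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

        def forward(self, a, b):
            return self.net(torch.cat([a, b], dim=-1)).squeeze(-1)

    def ranking_step(scorer, optimizer, X, triplets, margin=0.1):
        """One update enforcing s(i, j) > s(i, k) + margin for each (i, j, k)."""
        i, j, k = (torch.as_tensor(t) for t in zip(*triplets))
        s_close = scorer(X[i], X[j])  # similarity to the closer sample
        s_far = scorer(X[i], X[k])    # similarity to the farther sample
        target = torch.ones_like(s_close)  # +1: first argument should rank higher
        loss = nn.MarginRankingLoss(margin=margin)(s_close, s_far, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()

Minimizing this loss pushes each closer pair at least a margin above the farther pair, which realizes both properties at once: neighboring pairs outrank distant pairs, and the ordering among neighboring pairs is preserved.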

In our ranking framework for score function learning, we define the relevant set of each data sample as the union of a neighboring sample set and a boundary sample set. While we can easily find the neighboring samples of a given sample, few methods exist for boundary sample discovery. To this end, we propose a simple but effective method to identify the boundary samples and provide a theoretical analysis. While the pairwise relevances between a sample and its neighbors capture the local structure, those between a sample and the boundary samples capture the global structure. In this way, we are able to exploit both the local structure and the global structure in our proposed unsupervised clustering.
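
The snippet available here does not spell out the discovery rule, so the sketch below implements one plausible reading of "a boundary sample is more separable from its neighbors": a sample whose neighborhood radius is large relative to the radii of its own neighbors (a ratio in the spirit of the local outlier factor). This is an illustrative heuristic, not the strategy proposed in Section 4.

    # Illustrative boundary-sample heuristic, assuming scikit-learn.
    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def boundary_samples(X, k=10, num_boundary=20):
        """Return indices of the samples most separable from their neighbors."""
        dist, idx = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)
        mean_knn = dist[:, 1:].mean(axis=1)  # column 0 is the sample itself
        # separability: own neighborhood radius relative to the neighbors' radii
        score = mean_knn / (mean_knn[idx[:, 1:]].mean(axis=1) + 1e-12)
        return np.argsort(score)[-num_boundary:]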

We highlight our contributions as follows:

  • Unlike the existing works that take the values of the pairwise distances as supervisory signals, our proposed method takes the ordering of these values, which directly influences the clustering results, as the supervisory signal.

  • In comparison with the existing state of the art, our proposed method not only captures the local structure through the relevance between a sample and its neighbors, but also captures the global structure through the relevance between a sample and the boundary samples.

  • Inspired by the observation that a boundary sample is more separable from its neighbors, we propose a simple but effective strategy to identify the boundary samples in a given dataset.

The rest of this paper is organized as follows. Section 2 presents the related work on unsupervised dataset analysis and learning to rank. Section 3 presents our proposed deep clustering method. Section 4 shows how we identify the relevant sample set. Section 5 conducts experiments to evaluate the proposed method, and Section 6 concludes this paper.

Section snippets

Related work

To pave the way for our proposed method, we briefly review the existing work on unsupervised dataset analysis and reformulate the objective functions of these methods to show the importance of pairwise differences. In addition, we also present the related work on learning to rank.

Approach

As detailed in Section 2.1, the existing unsupervised data analysis approaches guide the data sample embedding procedure with the pairwise difference between samples. However, our extensive literature survey reveals that none of the existing work has explicitly studied the ordering of pairwise similarities (or distances) in unsupervised clustering. We explicitly take into consideration the ordering information of the pairwise similarities and formulate them in a ranking framework for embedding

Relevant Sample Set

To reveal both the local and the global structure with the learned score function, we could simply consider a data sample $x_i$ to be relevant to the whole sample set, i.e., $R_{x_i} = X$. However, this is computationally expensive for large datasets, and it is difficult to disentangle the local structure from the global structure. For efficient and effective learning, we instead consider a sample $x_i$ to be relevant to the union of two sets, i.e., $R_{x_i} = N(x_i) \cup \Omega(X)$, with the neighboring sample set $N(x_i)$ to reveal the
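
A minimal sketch of this construction, assuming scikit-learn for the k-NN search; here omega stands for the boundary set Ω(X), produced by any boundary discovery method (for instance, the heuristic sketched in the introduction):

    # Illustrative: build R_{x_i} = N(x_i) ∪ Ω(X) for every sample.
    from sklearn.neighbors import NearestNeighbors

    def relevant_sets(X, omega, k=10):
        """relevant_sets(X, omega)[i] = x_i's k neighbors plus the boundary set."""
        _, idx = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)
        omega = set(int(o) for o in omega)  # global boundary set, shared by all i
        return [sorted((set(int(t) for t in idx[i, 1:]) | omega) - {i})
                for i in range(len(X))]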

Experiments

To evaluate our proposed methods, we carry out extensive experiments in a number of phases. In the first phase, we show the effectiveness of our boundary sample discovery method on synthetic datasets and object image datasets. In the second phase, we conduct image clustering on three different tasks, including face image clustering, object image clustering, and real-world image clustering. We also analyze the influence of the size of the relevant sample set and the initial representations on the
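
The metrics are not named in this snippet; deep clustering work commonly reports clustering accuracy (the best one-to-one matching between cluster ids and class ids, solved by the Hungarian method) together with normalized mutual information, so the sketch below reflects that common protocol rather than the paper's exact evaluation code.

    # Illustrative: the usual deep-clustering evaluation metrics.
    import numpy as np
    from scipy.optimize import linear_sum_assignment
    from sklearn.metrics import normalized_mutual_info_score

    def clustering_accuracy(y_true, y_pred):
        """Accuracy under the best one-to-one cluster-to-class mapping."""
        y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
        n = int(max(y_true.max(), y_pred.max())) + 1
        cost = np.zeros((n, n), dtype=np.int64)
        for t, p in zip(y_true, y_pred):
            cost[p, t] += 1  # count co-occurrences of cluster p and class t
        rows, cols = linear_sum_assignment(cost.max() - cost)  # maximize matches
        return cost[rows, cols].sum() / len(y_true)

    # nmi = normalized_mutual_info_score(y_true, y_pred)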

Conclusion and future work

In this paper, we propose a new method to learn a similarity score function and achieve improved performance in unsupervised clustering. Compared with the existing state of the art, the novelty and the value of our proposal can be validated by three original contributions: (i) the ordering of pairwise differences is introduced as the supervisory signal; (ii) not only the relevance between neighboring sample pairs is considered to capture the local structure, but also the relevance between

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

The authors wish to acknowledge the financial support from: (i) the National Natural Science Foundation of China (NSFC) under Grant no. 62032015; and (ii) the National Natural Science Foundation of China (NSFC) under Grant no. 62172285.

Jinghua Wang received the BEng degree from Shandong University, China, in 2005, the MS degree from the Harbin Institute of Technology, China, in 2009, and the PhD degree from The Hong Kong Polytechnic University, Hong Kong, in 2013. He is currently an Assistant Professor with College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China. His current research interests include computer vision and machine learning.

References (41)

  • A. Gupta et al., Parameterized principal component analysis, Pattern Recognit. (2018)

  • Y. Zhang et al., Semi-supervised local multi-manifold isomap by linear embedding for feature extraction, Pattern Recognit. (2018)

  • X. Shi et al., Pairwise based deep ranking hashing for histopathology image classification and retrieval, Pattern Recognit. (2018)

  • A.Y. Ng et al., On spectral clustering: analysis and an algorithm, NIPS (2001)

  • G.E. Hinton et al., Reducing the dimensionality of data with neural networks, Science (2006)

  • J. Wang et al., An unsupervised deep learning framework via integrated optimization of representation learning and GMM-based modeling, 14th Asian Conference on Computer Vision (2018)

  • P. Zhou et al., Deep adversarial subspace clustering, CVPR (2018)

  • J. Xie et al., Unsupervised deep embedding for clustering analysis, ICML (2016)

  • K.G. Dizaji et al., Deep clustering via joint convolutional autoencoder embedding and relative entropy minimization, ICCV (2017)

  • H. Fan et al., Unsupervised person re-identification: clustering and fine-tuning, ACM Trans. Multimedia Comput. Commun. Appl. (2017)

Li Wang received the BE degree from Southeast University, China, in 2006, and the ME degree from Shanghai Jiao Tong University, China, in 2009. He received the PhD degree from the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, in 2016. Currently, he is a research scientist at the Institute for Infocomm Research, A*STAR, Singapore. His research interests include deep learning, computer vision, and image processing.

Jianmin Jiang received the PhD degree from the University of Nottingham, UK, in 1994. From 1997 to 2001, he worked as a full professor of Computing at the University of Glamorgan, Wales, UK. In 2002, he joined the University of Bradford, UK, as a Chair Professor of Digital Media and Director of the Digital Media & Systems Research Institute. He worked at the University of Surrey, UK, as a full professor during 2010-2014 and as a distinguished professor (1000-plan) at Tianjin University, China, during 2010-2013. He is currently a Distinguished Professor and director of the Research Institute for Future Media Computing at the College of Computer Science & Software Engineering, Shenzhen University, China. He was a chartered engineer, fellow of IEE, fellow of RSA, member of the EPSRC College in the UK, and an EU FP-6/7 evaluator. His research interests include image/video processing in the compressed domain, digital video coding, medical imaging, computer graphics, machine learning, and AI applications in digital media processing, retrieval, and analysis. He has published around 400 refereed research papers.
