Preserving similarity order for unsupervised clustering
Introduction
Unsupervised machine learning techniques mine knowledge from datasets in which no sample labels are available. These techniques are widely applied in applications such as visual similarity analysis [1], domain adaptation [2], surveillance [3], and image retrieval [4]. Clustering is one of the most popular unsupervised learning tasks in artificial intelligence and computer vision [5], [6]. It categorizes a collection of unlabeled data samples into groups whose members are expected to share high-level concepts. Clustering techniques are not only widely used in data analytics, but also applicable to automatic data labeling and data visualization.
Due to the curse of dimensionality, some popular clustering techniques, including K-means and the Gaussian mixture model, are not directly applicable to high-dimensional data samples. To solve this problem, many unsupervised dimension reduction or low-dimensional representation learning techniques have been proposed to reveal the intrinsic structure of a given dataset, such as multi-dimensional scaling (MDS) [7], locally linear embedding (LLE) [8], and spectral embedding (SE) [9]. These subspace learning techniques can improve clustering performance by learning a more clustering-friendly representation space; in other words, they simultaneously enhance the intra-cluster similarity and the inter-cluster dissimilarity in the representation space.
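As a concrete illustration of this pipeline (using off-the-shelf scikit-learn components, not the method proposed in this paper), the sketch below embeds high-dimensional samples with spectral embedding before running K-means in the learned space:

```python
# Embed-then-cluster sketch: spectral embedding followed by K-means.
# The data here are synthetic blobs; any high-dimensional samples apply.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.manifold import SpectralEmbedding
from sklearn.cluster import KMeans

X, y = make_blobs(n_samples=300, n_features=50, centers=3, random_state=0)

# Learn a low-dimensional representation that preserves the neighborhood graph.
Z = SpectralEmbedding(n_components=2, random_state=0).fit_transform(X)

# Cluster in the embedding space instead of the raw 50-D feature space.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(Z)
print(labels.shape)  # one cluster assignment per sample
```

Running K-means on `Z` rather than `X` is exactly the "clustering-friendly representation" idea: the embedding step compresses the intrinsic structure into a space where Euclidean distances are meaningful.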
In order to learn proper low-level sample representations, many unsupervised dataset analysis techniques define their objective functions based on the pairwise difference [7], [8], [9], i.e., z_i − z_j, where z_i and z_j are the embeddings of two data samples (Section 2.1 provides more details). Thus, we can regard the pairwise difference as the implicit supervisory signal in representation learning. The importance of the pairwise difference, or pairwise distance, in unsupervised data analysis lies in its capability to reveal the relationships among the data samples. In a clustering task, however, the ordering of the pairwise distances directly influences the final clustering results and, in most cases, matters more than the distance values themselves. Yet, to the best of our knowledge, existing research has not taken the ordering of pairwise distances into consideration for unsupervised clustering. To this end, we propose to learn a score function in a ranking framework based on the ordering of pairwise distances. The learned similarity score function is tailored to the dataset and able to reveal its intrinsic structure.
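To make this supervisory signal concrete, the following sketch (the function name and toy data are ours, not from the paper) extracts, for each sample, the ordering of its pairwise distances to all other samples; it is this ranking, not the raw distance values, that the proposed framework treats as the supervisory signal:

```python
# Extract the per-sample nearest-to-farthest ordering of pairwise distances.
import numpy as np

def distance_ordering(X):
    """Return, for each sample, the indices of all other samples
    sorted from nearest to farthest (a row-wise ranking)."""
    diff = X[:, None, :] - X[None, :, :]   # pairwise differences z_i - z_j
    dist = np.linalg.norm(diff, axis=-1)   # pairwise Euclidean distances
    order = np.argsort(dist, axis=1)       # per-sample ranking of distances
    return order[:, 1:]                    # drop each sample's self (distance 0)

X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])
print(distance_ordering(X))  # [[1 2], [0 2], [1 0]]
```

Note that scaling all features by a constant changes every distance value but leaves this ordering untouched, which is why the ordering is the more robust signal for clustering.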
The existing methods [7], [8], [9] are also limited to low-level features and cannot discover the deep correlations between samples. Driven by the tremendous success of deep neural networks in various computer vision tasks [10], [11], researchers have proposed to boost clustering performance with deep representations [12], [13]. Compared with supervised deep learning, the training procedures of these unsupervised deep learning methods [14], [15] are more difficult due to the unavailability of sample labels. To alleviate this difficulty, a number of supervisory signals have been proposed, such as soft cluster labels [14] and the K-means-friendliness of the representations [15]. Researchers have also guided the training of convolutional neural networks with k-means [16] and constrained dominant sets [17]. Different from the existing deep clustering methods, our representation learning method is formulated in a ranking framework and adopts the ordering of pairwise distances as the supervisory signal. In summary, our proposed method maps the data samples into an embedding space where 1) the ordering of similarities between neighboring sample pairs is preserved, and 2) the similarities between neighboring sample pairs are larger than those between distant sample pairs.
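A minimal sketch of a loss consistent with these two properties is a margin-based ranking hinge (the paper's exact objective is not shown in this excerpt; this is one standard way to enforce a pairwise ordering):

```python
# Margin-based ranking hinge: zero loss when a neighboring pair scores
# higher than a distant pair by at least the margin, linear penalty otherwise.
import numpy as np

def ranking_hinge(score_pos, score_neg, margin=1.0):
    """Penalize violations of s(x, neighbor) > s(x, distant) + margin."""
    return np.maximum(0.0, margin - (score_pos - score_neg))

print(ranking_hinge(2.0, 0.5))  # 0.0: the ordering is satisfied by the margin
print(ranking_hinge(0.5, 0.4))  # 0.9: ordering barely holds, still penalized
```

Summing such terms over sample pairs trains a score function that ranks neighboring pairs above distant ones, matching properties 1) and 2) above.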
In our ranking framework for score function learning, we define the relevant set of each data sample as the union of a neighboring sample set and a boundary sample set. While the neighboring samples of a given sample are easy to find, few methods exist for boundary sample discovery. To this end, we propose a simple but effective method to identify the boundary samples and provide a theoretical analysis. While the pairwise relevances between a sample and its neighbors capture the local structure, those between a sample and the boundary samples capture the global structure. In this way, our unsupervised clustering exploits both the local structure and the global structure.
We highlight our contributions as follows:
- Unlike existing works that take the values of the pairwise distances as supervisory signals, our proposed method takes the ordering of these values, which directly influences the clustering results, as the supervisory signal.
- In comparison with existing state-of-the-art methods, our proposed method not only captures the local structure via the relevance between a sample and its neighbors, but also captures the global structure via the relevance between a sample and the boundary samples.
- Inspired by the observation that a boundary sample is more separable from its neighbors, we propose a simple but effective strategy to identify the boundary samples in a given dataset.
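The paper's exact criterion is not reproduced in this excerpt; the sketch below is one plausible reading of "a boundary sample is more separable from its neighbors": rank samples by their mean distance to their k nearest neighbors and treat the largest as boundary candidates. All names and thresholds here are illustrative assumptions:

```python
# Heuristic boundary-candidate discovery via mean k-NN distance.
import numpy as np

def boundary_candidates(X, k=3, n_boundary=2):
    """Return the indices of the n_boundary samples whose mean distance
    to their k nearest neighbors is largest (i.e., most separable)."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Mean distance to the k nearest neighbors, excluding the sample itself.
    knn_dist = np.sort(dist, axis=1)[:, 1:k + 1].mean(axis=1)
    return np.argsort(knn_dist)[-n_boundary:]

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
print(boundary_candidates(X))
```

Samples deep inside a cluster have tightly packed neighbors and a small mean k-NN distance, while samples near the cluster boundary sit in sparser regions, which is what this ranking exploits.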
The rest of this paper is organized as follows. Section 2 presents the related work on unsupervised dataset analysis and learning to rank. Section 3 proposes our deep clustering method. Section 4 shows how we identify the relevant sample set. Section 5 conducts experiments to evaluate the proposed method, and Section 6 concludes the paper.
Related work
To pave the way for our proposed research, we briefly review the existing relevant work on unsupervised dataset analysis and reformulate their objective functions to show the importance of pairwise differences. In addition, we present the related work on learning to rank.
Approach
As detailed in Section 2.1, the existing unsupervised data analysis approaches guide the data sample embedding procedure with the pairwise differences between samples. However, our extensive literature survey reveals that no existing work has explicitly studied the ordering of pairwise similarities (or distances) in unsupervised clustering. We explicitly take this ordering information into consideration and formulate it in a ranking framework for embedding
Relevant Sample Set
To reveal both the local and the global structure with the learned score function, we could simply consider a data sample to be relevant to the whole sample set. However, this is computationally expensive for large datasets and makes it difficult to disentangle the local structure from the global structure. For efficient and effective learning, we instead consider a sample to be relevant to the union of two sets, with the neighboring sample set to reveal the
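A minimal sketch of this construction (the function name, the fixed `boundary` indices, and the toy data are illustrative, not from the paper): the relevant set of sample `i` is the union of its k nearest neighbors, capturing local structure, and a set of boundary samples, capturing global structure:

```python
# Relevant set = k nearest neighbors (local) ∪ boundary samples (global).
import numpy as np

def relevant_set(X, i, k=2, boundary=(0,)):
    """Union of sample i's k nearest neighbors and the boundary indices."""
    dist = np.linalg.norm(X - X[i], axis=1)
    neighbors = np.argsort(dist)[1:k + 1]  # k nearest, excluding i itself
    return (set(int(j) for j in neighbors) | set(boundary)) - {i}

X = np.array([[0.0], [0.1], [0.2], [5.0], [5.1]])
print(relevant_set(X, 1))  # neighbors of sample 1 plus the boundary index
```

Restricting the ranking loss to this small union, instead of all pairs, keeps training tractable on large datasets while still injecting global information through the boundary samples.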
Experiments
To evaluate our proposed methods, we carry out extensive experiments in a number of phases. In the first phase, we show the effectiveness of our boundary sample discovery method on synthetic datasets and object image datasets. In the second phase, we conduct image clustering on three different tasks: face image clustering, object image clustering, and real-world image clustering. We also analyze the influence of the size of the relevant sample set and the initial representations on the
Conclusion and future work
In this paper, we propose a new method that learns a similarity score function and achieves improved performance for unsupervised clustering. Compared with existing state-of-the-art methods, the novelty and value of our proposal lie in three original contributions: (i) the ordering of pairwise differences is introduced as the supervisory signal; (ii) not only the relevance between neighboring sample pairs is considered to capture the local structure, but also the relevance between
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgment
The authors wish to acknowledge the financial support from the Natural Science Foundation of China (NSFC) under Grant nos. 62032015 and 62172285.
References (41)
- et al., Deep unsupervised learning of visual similarities, Pattern Recognit. (2018)
- et al., Learning domain-shared group-sparse representation for unsupervised domain adaptation, Pattern Recognit. (2018)
- et al., Combining where and what in change detection for unsupervised foreground learning in surveillance, Pattern Recognit. (2015)
- et al., Unsupervised manifold learning through reciprocal kNN graph and connected components for image retrieval tasks, Pattern Recognit. (2018)
- et al., Multi-view intact space clustering, Pattern Recognit. (2019)
- et al., Clustering quality metrics for subspace clustering, Pattern Recognit. (2020)
- et al., M-estimators for robust multidimensional scaling employing l21 norm regularization, Pattern Recognit. (2018)
- et al., Weighted locally linear embedding for dimension reduction, Pattern Recognit. (2009)
- et al., Unsupervised deep clustering via adaptive GMM modeling and optimization, Neurocomputing (2021)
- et al., Joint graph optimization and projection learning for dimensionality reduction, Pattern Recognit. (2019)
- Parameterized principal component analysis, Pattern Recognit.
- Semi-supervised local multi-manifold isomap by linear embedding for feature extraction, Pattern Recognit.
- Pairwise based deep ranking hashing for histopathology image classification and retrieval, Pattern Recognit.
- On spectral clustering: analysis and an algorithm, NIPS
- Reducing the dimensionality of data with neural networks, Science
- An unsupervised deep learning framework via integrated optimization of representation learning and GMM-based modeling, 14th Asian Conference on Computer Vision
- Deep adversarial subspace clustering, CVPR
- Unsupervised deep embedding for clustering analysis, ICML
- Deep clustering via joint convolutional autoencoder embedding and relative entropy minimization, ICCV
- Unsupervised person re-identification: clustering and fine-tuning, ACM Trans. Multimedia Comput. Commun. Appl.
Jinghua Wang received the BEng degree from Shandong University, China, in 2005, the MS degree from the Harbin Institute of Technology, China, in 2009, and the PhD degree from The Hong Kong Polytechnic University, Hong Kong, in 2013. He is currently an Assistant Professor with College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China. His current research interests include computer vision and machine learning.
Li Wang received the BE degree from Southeast University, China, in 2006 and the ME degree from Shanghai Jiao Tong University, China, in 2009. He received the PhD degree from School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, in 2016. Currently, he is a research scientist at Institute for Infocomm Research, A*STAR, Singapore. His research interests include deep learning, computer vision and image processing.
Jianmin Jiang received PhD from the University of Nottingham, UK, in 1994. From 1997 to 2001, he worked as a full professor of Computing at the University of Glamorgan, Wales, UK. In 2002, he joined the University of Bradford, UK, as a Chair Professor of Digital Media, and Director of Digital Media & Systems Research Institute. He worked at the University of Surrey, UK, as a full professor during 2010–2014 and a distinguished professor (1000-plan) at Tianjin University, China, during 2010–2013. He is currently a Distinguished Professor and director of the Research Institute for Future Media Computing at the College of Computer Science & Software Engineering, Shenzhen University, China. He was a chartered engineer, fellow of IEE, fellow of RSA, member of EPSRC College in the UK, and EU FP-6/7 evaluator. His research interests include, image/video processing in compressed domain, digital video coding, medical imaging, computer graphics, machine learning and AI applications in digital media processing, retrieval and analysis. He has published around 400 refereed research papers.