Elsevier

Knowledge-Based Systems

Volume 242, 22 April 2022, 108336

Refining pseudo labels for unsupervised Domain Adaptive Re-Identification

https://doi.org/10.1016/j.knosys.2022.108336

Abstract

Our paper focuses on unsupervised domain adaptation for person re-identification (UDA re-ID) in an end-to-end framework. Most existing methods tackle the problem by mining pseudo labels, and they rely heavily on the reliability of these labels. However, the labels are mainly generated by clustering-based algorithms, which inevitably introduce noise into the framework. Such noise substantially hinders the model from further improving feature representations in the target domain. We call these clustering-based labels HARD pseudo labels. To decrease the impact of these noisy hard labels, we directly treat the relative relationships of sample pair distances, rather than clustering results, as another kind of pseudo label, called SOFT pseudo labels. Furthermore, since the constructed relative relationships are not always correct and wrong relationships may have negative effects, it is not appropriate to assign the same weight to each sample pair. To alleviate the negative influence caused by tiny distances and strengthen the positive effect of large distances, we assign different weights according to the distances to adjust these relationships. By integrating the HARD labels and SOFT labels, the proposed method achieves considerable improvements on several popular person re-ID benchmarks, e.g., the Market1501-to-Duke, Duke-to-Market1501, Market-to-MSMT17 and Duke-to-MSMT17 UDA tasks.

Introduction

Person re-identification (re-ID) [1] aims to retrieve images of a particular pedestrian from a large dataset, given a query image of the person of interest. It is one of the most important visual association techniques in video surveillance and can effectively complement the visual tracking task [2], [3]. Recently, the performance of deep re-ID methods has been dramatically improved by convolutional neural networks [4], [5], [6]. However, in real-world applications, these re-ID models often suffer from domain shift caused by different camera viewpoints, illumination conditions and scenes. Performance drops dramatically when trained models are directly applied to a new surveillance scene, and domain shift has become one of the major challenges hindering the wider application of person re-ID algorithms.

To alleviate this problem, one straightforward method is to collect and manually annotate a large amount of new data from the new environments for training, as traditional re-ID methods do [7], [8]. However, this approach is not practical, as collecting and labeling sufficient data is too expensive. Since the scarcity of labeled training data prohibits supervised model training, it is essential to make full use of unlabeled target data [9]. In this paper, we aim to address this challenging but practical topic, i.e., Unsupervised Domain Adaptive Re-identification (UDA re-ID).

The key issue in domain adaptation is to reduce the distribution difference between two domains, so that the knowledge learned from the source domain can be easily generalized to the new domain. However, generic UDA problems [10], [11], [12] always assume that the two domains (i.e., source domain and target domain) share the same set of classes, whereas in person re-ID the two domains have entirely different classes, and even the number of classes in the target domain is unknown. Therefore, general UDA methods cannot be applied to cross-domain re-ID problems directly. Most existing UDA re-ID approaches [13], [14], [15] assign labels to the target samples with a clustering algorithm and then train the model with the generated pseudo labels (we call them HARD pseudo labels).

Cluster-based methods have commonly been employed to produce pseudo labels in previous UDA re-ID methods. First, a model pre-trained on the source domain is required. Then, the target samples are fed into the pre-trained model to extract features. After that, a clustering algorithm is employed to generate pseudo labels based on the extracted features. Finally, the model is trained on the target samples with these labels. This whole procedure runs iteratively to optimize the re-ID model. The training of the model is substantially hindered by unreliability in the target domain, which mainly comes from two aspects. On the one hand, as the features of samples in the target domain are extracted by the model pre-trained on the source domain, they are not always reliable due to domain shift. On the other hand, to generate hard pseudo labels, unsupervised clustering methods (e.g., k-means and DBSCAN [16]) are usually leveraged to divide the data into different clusters, and the samples in the same cluster are then treated as the same identity. These coarse-grained hard labels inevitably contain noise. As a consequence, the pseudo labels on the target domain are not always correct. As shown in Fig. 1, the noise in pseudo labels comes from the feature representation and is further amplified by the hard decision boundary of the clustering algorithm. This problem has been largely ignored by previous methods, yet it is shown to be critical for achieving superior final performance.
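The clustering step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the 2-D feature vectors are hypothetical stand-ins for the output of a source-pretrained model, the helper names (`euclidean`, `kmeans_hard_labels`) are invented for illustration, and a plain k-means with a fixed seeding rule is assumed in place of whichever clustering algorithm a real system would use. The last toy sample belongs to identity 0 but has drifted toward identity 1, which shows how a hard decision boundary can turn a feature error into a label error.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans_hard_labels(features, k, iters=20):
    """Return one cluster index per feature vector (the hard pseudo label)."""
    # Assumption for reproducibility: the first k vectors seed the centroids.
    centroids = [list(f) for f in features[:k]]
    labels = [0] * len(features)
    for _ in range(iters):
        # Assignment step: each sample takes the index of its nearest centroid.
        labels = [min(range(k), key=lambda c: euclidean(f, centroids[c]))
                  for f in features]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [f for f, lab in zip(features, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical 2-D features standing in for the output of a source-pretrained model.
# The last sample belongs to identity 0 but has drifted toward identity 1.
features = [(0.0, 0.0), (2.0, 2.0), (0.2, 0.1), (2.1, 1.8),
            (0.1, 0.3), (1.9, 2.2), (1.3, 1.2)]
true_ids = [0, 1, 0, 1, 0, 1, 0]

hard_labels = kmeans_hard_labels(features, k=2)
noisy = [i for i, (p, t) in enumerate(zip(hard_labels, true_ids)) if p != t]
print("hard pseudo labels:", hard_labels)
print("mislabeled sample indices:", noisy)
```

In a full pipeline the model would then be trained on `hard_labels`, features would be re-extracted, and clustering would be repeated, so a mislabeled sample like the last one keeps feeding wrong supervision back into training. Cluster indices happen to line up with the true identities here only because of the seeding order; in general they are arbitrary.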

Our paper focuses on reducing the noise in pseudo labels. Since existing clustering algorithms cannot prevent noise in hard labels, we skip over the clustering process to prevent the accumulation of errors. In addition to the hard pseudo labels, we directly leverage the relative relationships of sample pair distances, where a relative relationship denotes the relative distance between sample pairs. We consider these relationships to be a kind of SOFT pseudo label. With these soft labels, a ranking loss is proposed to assist the training of the re-ID model. Following the triplet loss, we set a margin in these relationships, which relieves the noise caused by misclassified negative sample pairs. However, as shown in Fig. 2, even a tiny distance between features produces a relative relationship, and for positive sample pairs the ranking loss with a margin may hinder the learning of intra-class compactness. Intuitively, sample pairs with a smaller distance have a larger probability of sharing the same identity, while sample pairs with a larger distance are likely to come from different identities. Since the distances of the sample pairs also carry useful information, we add a distance weight to the ranking loss, which weakens the negative influence of tiny distances and strengthens the positive effect of large distances. In this way, the noise brought by misclassified positive sample pairs can be mitigated. Notably, our soft pseudo labels are not used to replace the hard pseudo labels, but to reduce the negative influence of hard labels and enhance the re-ID model. The aim of our method is to optimize the network under the joint supervision of hard pseudo labels and soft pseudo labels. In turn, the pseudo labels themselves are refined.
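To make the idea of a distance-weighted ranking loss concrete, the following is one possible sketch, not the formulation used in the paper, whose exact loss is not given in this excerpt. Everything here is an assumption made for illustration: the function name `weighted_ranking_loss`, the use of a separate list of reference distances to define the soft labels, the hinge form borrowed from the triplet loss, the margin value, and the choice of the reference distance gap as the weight.

```python
def weighted_ranking_loss(ref_dists, cur_dists, margin=0.3):
    """Distance-weighted ranking loss for one anchor (illustrative form only).

    ref_dists: anchor-to-sample distances that define the soft pseudo labels,
               i.e. which sample should be ranked closer to the anchor.
    cur_dists: anchor-to-sample distances from the features being optimized.
    """
    total, weight_sum = 0.0, 0.0
    n = len(ref_dists)
    for i in range(n):
        for j in range(n):
            # Soft label: sample i is ranked closer to the anchor than sample j.
            gap = ref_dists[j] - ref_dists[i]
            if gap <= 0:
                continue
            # Triplet-style hinge: penalize i not being closer than j by the margin.
            hinge = max(0.0, margin + cur_dists[i] - cur_dists[j])
            # Assumed weighting: a tiny gap is an unreliable relationship and
            # counts little, while a large gap counts more.
            total += gap * hinge
            weight_sum += gap
    return total / weight_sum if weight_sum > 0 else 0.0

# Hypothetical distances from one anchor to three other target samples.
ref = [0.10, 0.12, 0.90]
small_swap = weighted_ranking_loss(ref, [0.12, 0.10, 0.90])  # violates a tiny-gap pair
large_swap = weighted_ranking_loss(ref, [0.90, 0.12, 0.10])  # violates large-gap pairs
print(small_swap, large_swap)
```

Under this weighting, swapping two samples whose reference distances are nearly equal is penalized far less than swapping samples that are clearly far apart, which matches the stated intent of weakening the influence of tiny distances and strengthening that of large ones. In the joint setting described above, a term of this kind would be added to the loss computed from the hard pseudo labels rather than replacing it.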

In sum, the contributions of this paper are threefold.

  • This paper takes the relative relationships between unlabeled samples into consideration as soft pseudo labels. These soft labels aim to mitigate the noise influence in hard pseudo labels for UDA re-ID. Then a ranking loss is provided to assist the training process of the network based on the soft pseudo labels.

  • Considering the different distances in relative relationships, a distance weighting strategy is introduced into the ranking loss, which helps the ranking loss make full use of the soft pseudo labels. The experimental results reported later demonstrate that this weighted ranking loss is key to the superior performance of the proposed framework.

  • Experiments on various datasets verify the superiority of the joint hard labels and soft labels in UDA re-ID tasks.

The rest of the paper is organized as follows. We first review related work on domain adaptation and person re-ID in Section 2. Then, in Section 3, we introduce the proposed method in detail. In Section 4, the experimental results on benchmark datasets and the discussion are presented. Finally, Section 5 concludes the paper.

Section snippets

UDA for closed-set recognition

Traditional DA methods are mostly applied to closed-set problems. The goal of domain adaptation is to confuse the features between source and target domains, so that domain-invariant representations are ultimately obtained. The maximum mean discrepancy (MMD) [17], [18], [19], a non-parametric metric, is commonly used to measure the discrepancy between distributions, and several MMD-based methods have been proposed to relieve the domain discrepancy problem. Wu et al. [20] developed a

The proposed method

We propose a simple but novel framework for tackling the noise in hard pseudo labels in clustering-based UDA re-ID methods. Our method focuses on label noise, which has an important influence on the domain adaptation performance but is ignored by some previous methods. Our key idea is to utilize the relative relationships of target samples to build soft pseudo labels, which contain less noise, and help the hard pseudo labels achieve optimal domain adaptation performance.

Datasets and settings

We adopt three datasets for evaluation, i.e., Market-1501 [48], DukeMTMC-Re-ID [5] and MSMT17 [49]. The detailed descriptions are as follows and some example images are shown in Fig. 4.

  • Market-1501 is a widely used re-ID dataset containing 1501 identities and a total of 32,668 labeled images, captured from 6 camera viewpoints on a campus. A total of 751 identities, comprising 12,936 images, are used for training. A total of

Conclusion

In this paper, we propose a simple but novel framework to tackle the problem of noise in pseudo labels brought by clustering-based UDA methods for person re-ID. To mitigate the negative influence of these noisy pseudo labels, we skip over the clustering process and propose a ranking loss to learn the relative relationships of samples in the target domain. Furthermore, considering that the constructed relative relationships may not be correct and that they may lead to negative effects if given

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work is supported by the National Natural Science Fund of China (No. 62106003), the University Synergy Innovation Program of Anhui Province, China (No. GXXT-2021-005) and the Open Fund of Chongqing Key Laboratory of Bio-perception and Intelligent Information Processing, China (No. 2020CKL-BPIIP001). Additionally, this work is partially supported by the Chongqing Natural Science Fund, China (cstc2021jcyj-jqX0023), the National Key R&D Program of China (2021YFB3100800), CCF Hikvision Open Fund, China,

References (62)

  • Yang X. et al. Person reidentification via structural deep metric learning. IEEE TNNLS (2019)
  • Yang X. et al. Enhancing person re-identification in a self-trained subspace. ACM TOMM (2017)
  • Yang X. et al. Person re-identification with metric learning using privileged information. IEEE TIP (2017)
  • Fan H. et al. Unsupervised person re-identification: Clustering and fine-tuning. ACM TOMM (2018)
  • Pan S.J. et al. A survey on transfer learning. IEEE TKDE (2010)
  • Wang S. et al. Self-adaptive re-weighted adversarial domain adaptation. IJCAI (2020)
  • Wang S. et al. Class-specific reconstruction transfer learning for visual recognition across domains. IEEE TIP (2019)
  • Zhong Z. et al. Invariance matters: Exemplar memory for domain adaptive person re-identification.
  • Ding Y. et al. Adaptive exploration for unsupervised person re-identification. ACM TOMM (2020)
  • Ester M. et al. A density-based algorithm for discovering clusters in large spatial databases with noise.
  • Gretton A. et al. A kernel two-sample test. JMLR (2012)
  • Wang S. et al. Regularized deep transfer learning: When CNN meets KNN. IEEE Trans. Circuits Syst. II (2019)
  • Zhang L. et al. Manifold criterion guided transfer learning via intermediate domain generation. IEEE TNNLS (2019)
  • Wu H. et al. Iterative refinement for multi-source visual domain adaptation. IEEE TKDE (2020)
  • Wu H. et al. Heterogeneous domain adaptation by information capturing and distribution matching. IEEE TIP (2021)
  • Li J. et al. Locality preserving joint transfer for domain adaptation. IEEE TIP (2019)
  • Wu H. et al. Knowledge preserving and distribution alignment for heterogeneous domain adaptation. ACM Trans. Inf. Syst. (2021)
  • Ganin Y. et al. Domain-adversarial training of neural networks. JMLR (2016)
  • Bousmalis K. et al. Domain separation networks.
  • Deng W. et al. Image-image domain adaptation with preserved self-similarity and domain-dissimilarity for person re-identification.
  • Fu Y. et al. Self-similarity grouping: A simple unsupervised cross domain adaptation approach for person re-identification.


This work was done when Shanshan Wang was an intern at Alibaba, supervised by Weihua Chen.
