Part-pixel transformer with smooth alignment fusion for domain adaptation person re-identification

Kong, Jun; Zhou, Hua; Jiang, Min; Liu, Tianshan

doi:10.1007/s11760-024-03037-z


Original Paper
Published: 20 February 2024

Volume 18, pages 3737–3744, (2024)

Signal, Image and Video Processing

Jun Kong¹,
Hua Zhou²,
Min Jiang² &
…
Tianshan Liu³


Abstract

Generating pseudo-labels by clustering has proved effective for unsupervised domain adaptation (UDA) in person re-identification (re-ID). However, the pseudo-labels contain considerable noise, which limits further gains in model performance. Extracting representative features is key to solving this problem. In this paper, we propose the Part-Pixel Transformer with Smooth Alignment Fusion Network (PTFNet) to capture richer discriminative pedestrian features. Specifically, we design a Part-Pixel Transformer (PPformer) to model long-range dependencies between features. It splits the image horizontally to obtain parts whose regions are more highly correlated, and then captures pixel-level interactions within each horizontal part. We also propose a Smooth Alignment Fusion (SAF) module, composed of a Smooth Alignment block (SA-Block) and a Cross-layer Fusion block (CF-Block). First, the SA-Block smooths the cross-layer features to reduce the semantic gap between features from different layers. The smoothed features are then fed into the CF-Block, which aggregates low-level features carrying spatial information with high-level features carrying semantic information. Extensive experiments show that the proposed method significantly outperforms previous work on UDA tasks for person re-ID.
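The part-then-pixel idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' PPformer: the function name `part_pixel_attention`, the identity query/key/value projections, the single attention head, and the equal-height stripes are all assumptions made for illustration. It only shows a feature map being split into horizontal parts, with self-attention restricted to the pixels inside each part.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def part_pixel_attention(feat, num_parts):
    """Self-attention among pixels, restricted to horizontal stripes.

    feat: array of shape (H, W, C), a single feature map.
    num_parts: number of equal-height horizontal stripes (hypothetical
        parameter; H must be divisible by it).
    Returns the attended map of shape (H, W, C) and the attention
    weights of shape (num_parts, N, N), where N = (H // num_parts) * W.
    """
    h, w, c = feat.shape
    if h % num_parts != 0:
        raise ValueError("feature height must be divisible by num_parts")
    rows = h // num_parts

    # Each stripe of `rows` consecutive rows becomes its own token sequence.
    tokens = feat.reshape(num_parts, rows * w, c)

    # Assumption: identity Q/K/V projections instead of learned weights.
    scores = tokens @ tokens.transpose(0, 2, 1) / np.sqrt(c)
    weights = softmax(scores, axis=-1)

    # Pixels only aggregate information from pixels in the same stripe.
    attended = weights @ tokens
    return attended.reshape(h, w, c), weights
```

Because attention is computed per stripe, perturbing a pixel in one horizontal part leaves the outputs of the other parts unchanged. That is the locality property this sketch is meant to show. A learned model would add projection matrices, multiple heads, and the cross-part modelling the paper describes.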


Availability of data and materials

The datasets analyzed during the current study are available in Refs. [27,28,29,30].

References

Song, L., Zhou, X., Chen, Y.: Global attention-assisted representation learning for vehicle re-identification. SIViP 16(3), 807–815 (2022)
Tagore, N.K., Chattopadhyay, P.: A bi-network architecture for occlusion handling in person re-identification. SIViP, 1–9 (2022)
Wu, Q., Dai, P., Chen, P., et al.: Deep adversarial data augmentation with attribute guided for person re-identification. SIViP 15, 655–662 (2021)
Zhang, X., Hou, M., Deng, X., et al.: Multi-cascaded attention and overlapping part features network for person re-identification. SIViP 16(6), 1525–1532 (2022)
Ding, Y., Fan, H., Xu, M., et al.: Adaptive exploration for unsupervised person re-identification. ACM Transact. Multimedia Comput. Commun. Appl. (TOMM) 16(1), 1–19 (2020)
Zhong, Z., Zheng, L., Luo, Z. et al.: Invariance matters: exemplar memory for domain adaptive person re-identification. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 598–607 (2019).
Tao, X., Kong, J., Jiang, M., et al.: Unsupervised domain adaptation by multi-loss gap minimization learning for person re-identification. IEEE Trans. Circuits Syst. Video Technol. 32(7), 4404–4416 (2021)
Song, L., Wang, C., Zhang, L., et al.: Unsupervised domain adaptive re-identification: theory and practice. Pattern Recogn. 102, 107173 (2020)
Kumar, D., Siva, P., Marchwica, P. et al.: Unsupervised domain adaptation in person re-id via k-reciprocal clustering and large-scale heterogeneous environment synthesis. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2645–2654 (2020).
Zhai, Y., Lu, S., Ye, Q. et al.: Ad-cluster: augmented discriminative clustering for domain adaptive person re-identification. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 9021–9030 (2020).
Dosovitskiy, A., Beyer, L., Kolesnikov, A. et al.: An image is worth 16x16 words: transformers for image recognition at scale. arXiv preprint arXiv:2010.11929, (2020).
Zou, Y., Yang, X., Yu, Z. et al.: Joint disentangling and adaptation for cross-domain person re-identification. Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part II 16: Springer, 87–104 (2020).
Yang, F., Yan, K., Lu, S., et al.: Part-aware progressive unsupervised domain adaptation for person re-identification. IEEE Trans. Multimedia 23, 1681–1695 (2020)
He, S., Luo, H., Wang, P. et al.: Transreid: transformer-based object re-identification. Proceedings of the IEEE/CVF International Conference on Computer Vision, 15013–15022 (2021).
Zhou, D., Kang, B., Jin, X. et al.: Deepvit: towards deeper vision transformer. arXiv preprint arXiv:2103.11886, (2021).
Lin, H., Cheng, X., Wu, X. et al.: Cat: cross attention in vision transformer. 2022 IEEE International Conference on Multimedia and Expo (ICME): IEEE, 1–6 (2022).
Chu, X., Tian, Z., Wang, Y., et al.: Twins: revisiting the design of spatial attention in vision transformers. Adv. Neural. Inf. Process. Syst. 34, 9355–9366 (2021)
Liu, W., Anguelov, D., Erhan, D. et al.: Ssd: single shot multibox detector. Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part I 14: Springer, 21–37 (2016).
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, (2014).
Zhu, Z., Xu, M., Bai, S. et al.: Asymmetric non-local neural networks for semantic segmentation. Proceedings of the IEEE/CVF International Conference on Computer Vision, 593–602 (2019).
Honari, S., Yosinski, J., Vincent, P. et al.: Recombinator networks: learning coarse-to-fine feature aggregation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 5743–5752 (2016).
Luo, W., Yang, X., Mo, X. et al.: Cross-x learning for fine-grained visual categorization. Proceedings of the IEEE/CVF International Conference on Computer Vision, 8242–8251 (2019).
Ge, Y., Chen, D., Li, H.: Mutual mean-teaching: pseudo label refinery for unsupervised domain adaptation on person re-identification. arXiv preprint arXiv:2001.01526, (2020).
Si, T., He, F., Wu, H., et al.: Spatial-driven features based on image dependencies for person re-identification. Pattern Recogn. 124, 108462 (2022)
Luo, H., Gu, Y., Liao, X. et al.: Bag of tricks and a strong baseline for deep person re-identification. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 0–0 (2019).
He, K., Zhang, X., Ren, S. et al.: Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 770–778 (2016).
Ristani, E., Solera, F., Zou, R. et al.: Performance measures and a data set for multi-target, multi-camera tracking. Computer Vision–ECCV 2016 Workshops: Amsterdam, The Netherlands, October 8–10 and 15–16, 2016, Proceedings, Part II: Springer, 17–35 (2016).
Zheng, Z., Zheng, L., Yang, Y.: Unlabeled samples generated by gan improve the person re-identification baseline in vitro. Proceedings of the IEEE International Conference on Computer Vision, 3754–3762 (2017).
Zheng, L., Shen, L., Tian, L. et al.: Scalable person re-identification: a benchmark. Proceedings of the IEEE International Conference on Computer Vision, 1116–1124 (2015).
Wei, L., Zhang, S., Gao, W. et al.: Person transfer gan to bridge domain gap for person re-identification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 79–88 (2018).
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980, (2014).
Zhong, Z., Zheng, L., Cao, D. et al.: Re-ranking person re-identification with k-reciprocal encoding. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1318–1327 (2017).
Zhao, F., Liao, S., Xie, G.-S. et al.: Unsupervised domain adaptation with noise resistible mutual-training for person re-identification. Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XI 16: Springer, 526–544 (2020).
Zhai, Y., Ye, Q., Lu, S., et al.: Multiple expert brainstorming for domain adaptive person re-identification. Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part VII 16: Springer, 594–611 (2020).
Ge, Y., Zhu, F., Chen, D., et al.: Self-paced contrastive learning with hybrid memory for domain adaptive object re-id. Adv. Neural. Inf. Process. Syst. 33, 11309–11321 (2020)
Zheng, K., Lan, C., Zeng, W. et al.: Exploiting sample uncertainty for domain adaptive person re-identification. Proceedings of the AAAI Conference on Artificial Intelligence, 3538–3546 (2021).
Wang, W., Zhao, F., Liao, S., et al.: Attentive WaveBlock: Complementarity-enhanced mutual networks for unsupervised domain adaptation in person re-identification and beyond. IEEE Trans. Image Process. 31, 1532–1544 (2022)
Dai, Y., Liu, J., Bai, Y., et al.: Dual-refinement: Joint label and feature refinement for unsupervised domain adaptive person re-identification. IEEE Trans. Image Process. 30, 7815–7829 (2021)
Chen, H., Lagadec, B., Bremond, F.: Enhancing diversity in teacher-student networks via asymmetric branches for unsupervised person re-identification. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 1–10 (2021).
Zheng, K., Liu, W., He, L. et al.: Group-aware label transfer for domain adaptive person re-identification. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 5310–5319 (2021).
Zheng, Y., Tang, S., Teng, G. et al.: Online pseudo label generation by hierarchical cluster dynamics for adaptive person re-identification. Proceedings of the IEEE/CVF International Conference on Computer Vision, 8371–8381 (2021).
Han, J., Li, Y.-L., Wang, S.: Delving into probabilistic uncertainty for unsupervised domain adaptive person re-identification. Proceedings of the AAAI Conference on Artificial Intelligence, 790–798 (2022).
Si, T., He, F., Zhang, Z. et al.: Hybrid contrastive learning for unsupervised person re-identification. IEEE Transactions on Multimedia, (2022).


Acknowledgements

This work was partially supported by the Fundamental Research Funds for the Central Universities (No. JUSRP41908), the National Natural Science Foundation of China (Nos. 62371209, 62371208, 61362030 and 61201429), the China Postdoctoral Science Foundation (Nos. 2015M581720 and 2016M600360), and the 111 Projects under Grant No. B12018.

Author information

Authors and Affiliations

Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi, 214122, China
Jun Kong
Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence, Jiangnan University, Wuxi, 214122, China
Hua Zhou & Min Jiang
Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hong Kong, 999077, China
Tianshan Liu


Contributions

JK and HZ contributed significantly to analysis and manuscript preparation. MJ and TL performed the data analyses and wrote the manuscript. All authors reviewed the manuscript.

Corresponding author

Correspondence to Jun Kong.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Ethical approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Kong, J., Zhou, H., Jiang, M. et al. Part-pixel transformer with smooth alignment fusion for domain adaptation person re-identification. SIViP 18, 3737–3744 (2024). https://doi.org/10.1007/s11760-024-03037-z


Received: 31 July 2022
Revised: 20 May 2023
Accepted: 18 January 2024
Published: 20 February 2024
Issue Date: June 2024
DOI: https://doi.org/10.1007/s11760-024-03037-z

