DTMIReID: Person Re-identification Based on Deformable Transformer to Incorporate Mutual Information Between Images

  • Conference paper
Pattern Recognition (ICPR 2024)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15314)

Abstract

Person re-identification (ReID) aims to retrieve a target pedestrian from an image gallery captured by cameras in varied scenarios. Achieving good performance requires extracting rich, discriminative feature representations from images. Most current methods mine identifying information from a single image by examining different dimensions of that image. However, a single image is sometimes insufficient to characterize all the features needed to identify a pedestrian, especially when data quality is not guaranteed. Since a pedestrian tends to appear in many images, information missing from one image can be expected to be supplemented from others. We therefore extract more robust feature representations by exploiting the relationships between multiple pedestrian images, and propose a new method, DTMIReID. First, we propose a Transformer-based Dual Branch Attention Module (DBAM) to extract global and local features from single images. We then combine the extracted features of multiple images and feed them into our proposed Deformable Transformer Module (DTM), which simultaneously fuses the global and local features of these images through a Sample-Points-Based Attention (SPBA) mechanism. To the best of our knowledge, ours is the first ReID model that uses the Deformable Transformer to establish relationships between multiple features. Experimental results on four large ReID datasets show that the new method outperforms state-of-the-art published works by a large margin. DTMIReID is available at https://github.com/Titaniumyh/DTMIReID.git.
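
The abstract does not spell out how SPBA works, so the following is only a minimal sketch of the general idea behind sample-point (deformable) attention, not the authors' implementation. It assumes a simplified 1D setting in which the features of several images are concatenated into one token sequence, and each token attends to a small number of learned sampling positions instead of to every other token. The function name `sample_points_attention` and the parameters `w_offset`, `w_attn`, and `w_value` are hypothetical and chosen for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def sample_points_attention(tokens, w_offset, w_attn, w_value):
    """Illustrative 1D sample-point attention (an assumption, not the paper's SPBA).

    tokens:   (N, D) features concatenated from several images.
    w_offset: (D, K) projection that predicts K sampling offsets per token.
    w_attn:   (D, K) projection that predicts K attention logits per token.
    w_value:  (D, D) value projection.
    Returns an (N, D) array of fused features.
    """
    n, _ = tokens.shape
    values = tokens @ w_value                     # (N, D)
    offsets = tokens @ w_offset                   # (N, K) offsets in token units
    weights = softmax(tokens @ w_attn, axis=-1)   # (N, K), each row sums to 1

    # Each token's reference point is its own index; sample around it.
    ref = np.arange(n, dtype=float)[:, None]      # (N, 1)
    pos = np.clip(ref + offsets, 0.0, n - 1.0)    # (N, K) fractional positions

    # Linear interpolation between the two neighbouring tokens.
    lo = np.floor(pos).astype(int)
    hi = np.minimum(lo + 1, n - 1)
    frac = (pos - lo)[..., None]                  # (N, K, 1)
    sampled = (1.0 - frac) * values[lo] + frac * values[hi]  # (N, K, D)

    # Weighted sum over the K sampled points.
    return (weights[..., None] * sampled).sum(axis=1)


# Toy usage: 4 images, 5 tokens each, 8-dimensional features, 3 sample points.
rng = np.random.default_rng(0)
num_images, tokens_per_image, dim, num_points = 4, 5, 8, 3
features = rng.normal(size=(num_images * tokens_per_image, dim))
fused = sample_points_attention(
    features,
    rng.normal(size=(dim, num_points)),
    rng.normal(size=(dim, num_points)),
    rng.normal(size=(dim, dim)),
)
print(fused.shape)
```

Because the sampling positions range over the whole concatenated sequence, a token from one image can draw on tokens from another image, which is the cross-image fusion the abstract describes. The cost per token grows with the number of sample points rather than with the sequence length.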

Acknowledgments

This work is supported by the National Natural Science Foundation of China under Grant No. 61672325. We sincerely thank the anonymous reviewers for their valuable comments and suggestions.

Author information

Correspondence to Haodi Feng.

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Yang, H., Feng, H., Cui, X. (2025). DTMIReID: Person Re-identification Based on Deformable Transformer to Incorporate Mutual Information Between Images. In: Antonacopoulos, A., Chaudhuri, S., Chellappa, R., Liu, CL., Bhattacharya, S., Pal, U. (eds) Pattern Recognition. ICPR 2024. Lecture Notes in Computer Science, vol 15314. Springer, Cham. https://doi.org/10.1007/978-3-031-78341-8_29

  • DOI: https://doi.org/10.1007/978-3-031-78341-8_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-78340-1

  • Online ISBN: 978-3-031-78341-8

  • eBook Packages: Computer Science, Computer Science (R0)