3D Shape Temporal Aggregation for Video-Based Clothing-Change Person Re-identification

Han, Ke; Huang, Yan; Gong, Shaogang; Huang, Yan; Wang, Liang; Tan, Tieniu

doi:10.1007/978-3-031-26348-4_5


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13845)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

The 3D shape of the human body can provide information that is both discriminative and clothing-independent in video-based clothing-change person re-identification (Re-ID). However, existing Re-ID methods usually generate 3D body shapes without considering identity modelling, which severely weakens the discriminability of the resulting shapes. In addition, different video frames provide highly similar 3D shapes, but existing methods cannot capture the differences among 3D shapes over time. They are thus insensitive to the unique and discriminative 3D shape information of each frame, and they ineffectively aggregate many redundant framewise shapes into a videowise representation for Re-ID. To address these problems, we propose a 3D Shape Temporal Aggregation (3STA) model for video-based clothing-change Re-ID. To generate a discriminative 3D shape for each frame, we first introduce an identity-aware 3D shape generation module. It embeds identity information into the generation of 3D shapes through the joint learning of shape estimation and identity recognition. Second, a difference-aware shape aggregation module is designed to measure inter-frame 3D human shape differences and automatically select the unique 3D shape information of each frame. This helps minimise redundancy and maximise complementarity in temporal shape aggregation. We further construct a Video-based Clothing-Change Re-ID (VCCR) dataset to address the lack of publicly available datasets for video-based clothing-change Re-ID. Extensive experiments on the VCCR dataset demonstrate the effectiveness of the proposed 3STA model. The dataset is available at https://vhank.github.io/vccr.github.io.
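To make the idea of difference-aware aggregation more concrete, the following is a minimal sketch and not the 3STA module itself, whose details the abstract does not give. It assumes each frame is already summarised by a shape vector, scores each frame by its mean Euclidean distance to the other frames, and turns the scores into softmax weights so that frames carrying distinct shape information contribute more than redundant ones. The function name `difference_aware_aggregate`, the distance-based score, and the `temperature` parameter are all illustrative assumptions.

```python
import math
from typing import List, Sequence


def difference_aware_aggregate(frames: Sequence[Sequence[float]],
                               temperature: float = 1.0) -> List[float]:
    """Combine framewise shape vectors into one videowise vector.

    Hypothetical weighting rule (an assumption, not taken from the paper):
    softmax over each frame's mean L2 distance to the other frames.
    """
    n = len(frames)
    if n == 0:
        raise ValueError("need at least one frame")
    dim = len(frames[0])
    if any(len(f) != dim for f in frames):
        raise ValueError("all frames must have the same dimensionality")
    if n == 1:
        return list(frames[0])

    # Score each frame by how far it is, on average, from every other frame.
    scores = []
    for i in range(n):
        total = sum(math.dist(frames[i], frames[j]) for j in range(n) if j != i)
        scores.append(total / (n - 1))

    # Softmax over the scores, shifted by the maximum for numerical stability.
    top = max(scores)
    exps = [math.exp((s - top) / temperature) for s in scores]
    norm = sum(exps)
    weights = [e / norm for e in exps]

    # Weighted sum of the frames, one dimension at a time.
    return [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)]
```

With identical frames the weights are uniform and the result is simply the shared vector; when one frame differs from the rest, the aggregate moves toward it further than a plain average would. A learned module would presumably replace the fixed distance score with trainable components.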

Acknowledgements

This work was jointly supported by the National Key Research and Development Program of China (Grant No. 2018AAA0100400), the National Natural Science Foundation of China (62236010, 62276261, 61721004, and U1803261), the Key Research Program of Frontier Sciences, CAS (Grant No. ZDBS-LY-JSC032), the Beijing Nova Program (Z201100006820079), CAS-AIR, the fellowship of the China Postdoctoral Science Foundation (2022T150698), the China Scholarship Council, Vision Semantics Limited, and the Alan Turing Institute Turing Fellowship.

Author information

Authors and Affiliations

Center for Research on Intelligent Perception and Computing (CRIPAC), Institute of Automation, Chinese Academy of Sciences (CASIA), Beijing, China
Ke Han, Yan Huang, Yan Huang, Liang Wang & Tieniu Tan
University of Chinese Academy of Sciences (UCAS), Beijing, China
Ke Han, Yan Huang, Yan Huang, Liang Wang & Tieniu Tan
Queen Mary University of London (QMUL), London, England
Shaogang Gong

Corresponding author

Correspondence to Yan Huang.

Editor information

Editors and Affiliations

University of Wollongong, Wollongong, NSW, Australia
Lei Wang
University of Bonn, Bonn, Germany
Juergen Gall
University of Adelaide, Adelaide, SA, Australia
Tat-Jun Chin
National Institute of Informatics, Tokyo, Japan
Imari Sato
Johns Hopkins University, Baltimore, MD, USA
Rama Chellappa

About this paper

Cite this paper

Han, K., Huang, Y., Gong, S., Huang, Y., Wang, L., Tan, T. (2023). 3D Shape Temporal Aggregation for Video-Based Clothing-Change Person Re-identification. In: Wang, L., Gall, J., Chin, TJ., Sato, I., Chellappa, R. (eds) Computer Vision – ACCV 2022. ACCV 2022. Lecture Notes in Computer Science, vol 13845. Springer, Cham. https://doi.org/10.1007/978-3-031-26348-4_5

DOI: https://doi.org/10.1007/978-3-031-26348-4_5
Published: 09 March 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-26347-7
Online ISBN: 978-3-031-26348-4
eBook Packages: Computer Science, Computer Science (R0)
