DOI: 10.1145/3473714.3473770
research-article

A Multi-task Deep Network for video-based Person Re-identification

Published: 13 August 2021

Abstract

Video-based person re-identification (ReID) is an important area of computer vision with substantial application potential in video surveillance. Many existing methods address this difficulty by extracting only the features of a single person in the network. This paper proposes a Pose-Attribute network (MBNet) that uses multi-task learning with an embedded pose representation to combine pedestrian attribute recognition with ReID. MBNet extracts 18 human body keypoints and 12 pedestrian attributes (such as gender and shoes) from the image, and simultaneously learns local and global features. Experiments are conducted on two datasets, DukeMTMC-VID and MARS. The results show that MBNet achieves a higher mAP than pose-based or attribute-aware person re-identification methods, and that the proposed network outperforms other video-based person re-identification approaches.
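For readers unfamiliar with the mAP metric mentioned in the abstract, the following is a minimal sketch of how mean average precision is commonly computed for retrieval tasks such as ReID. It is not taken from the paper: the function names and the boolean-list input format are assumptions made for illustration, and benchmark-specific evaluation rules (for example, how gallery items are filtered before ranking) are not modeled.

```python
# Illustrative sketch only: standard retrieval mAP, not the paper's evaluation code.
# Each query is represented by a list of booleans over its ranked gallery results;
# flags[i] is True when the i-th ranked gallery item has the same identity as the query.
# (This input format and both function names are hypothetical.)

def average_precision(ranked_match_flags):
    """Average precision for one query, given its ranked match flags."""
    hits = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(ranked_match_flags, start=1):
        if is_match:
            hits += 1
            # Precision at this rank, counted only at positions holding a true match.
            precision_sum += hits / rank
    # A query with no true match in the gallery contributes 0 in this sketch.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(flags_per_query):
    """Mean of the per-query average precisions."""
    aps = [average_precision(flags) for flags in flags_per_query]
    return sum(aps) / len(aps) if aps else 0.0


if __name__ == "__main__":
    # Two toy queries: matches at ranks 1 and 3, and a single match at rank 2.
    queries = [[True, False, True], [False, True]]
    print(round(mean_average_precision(queries), 4))
```

A higher mAP means that correct identities tend to appear earlier in the ranked gallery list across all queries, which is the sense in which the abstract compares MBNet with other methods.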


Cited By

  • (2024) Single-Task Joint Learning Model for an Online Multi-Object Tracking Framework. Applied Sciences 14:22 (10540). DOI: 10.3390/app142210540. Online publication date: 15-Nov-2024


    Published In

    ICCIR '21: Proceedings of the 2021 1st International Conference on Control and Intelligent Robotics
    June 2021
    807 pages
    ISBN: 9781450390231
    DOI: 10.1145/3473714

    In-Cooperation

    • Chongqing Univ.: Chongqing University

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Human pose estimator
    2. pedestrian attribute recognition
    3. person reidentification

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICCIR 2021

    Acceptance Rates

    ICCIR '21 Paper Acceptance Rate 131 of 239 submissions, 55%;
    Overall Acceptance Rate 131 of 239 submissions, 55%
