research-article

A Multi-task Deep Network for video-based Person Re-identification

Authors:

Guowei Yang

ICCIR '21: Proceedings of the 2021 1st International Conference on Control and Intelligent Robotics

Pages 320 - 324

https://doi.org/10.1145/3473714.3473770

Published: 13 August 2021

Abstract

Video-based person re-identification (ReID) is an important area of computer vision with considerable application potential in video surveillance. Many existing methods approach the task by extracting only the features of a single person in the network. This paper proposes a Pose-Attribute network (MBNet) that uses multi-task learning with an embedded pose representation to combine pedestrian attribute recognition with ReID. MBNet extracts 18 human body keypoints and 12 pedestrian attributes (gender, shoes, etc.) from the image and learns local and global features simultaneously. Experiments are conducted on two datasets, DukeMTMC-VID and MARS. The results show that MBNet achieves a higher mAP than pose-based or attribute-aware person re-identification methods and performs better than other video-based person re-identification approaches.
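The abstract reports results as mAP (mean average precision). As a rough illustration of how that metric is commonly computed for retrieval-style ReID, the sketch below averages per-query average precision over ranked gallery results. It is not the paper's evaluation code: the function names and the toy rankings are hypothetical, and benchmark protocols typically also filter some gallery items (for example same-camera matches) before ranking, which this sketch omits.

```python
from typing import Sequence


def average_precision(ranked_matches: Sequence[bool]) -> float:
    """Average precision for one query.

    ranked_matches[i] is True when the i-th ranked gallery item
    has the same identity as the query.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            # Precision measured at the rank of each correct match.
            precision_sum += hits / rank
    # A query with no correct match contributes 0 in this sketch.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(all_ranked_matches: Sequence[Sequence[bool]]) -> float:
    """Mean of the per-query average precision values."""
    if not all_ranked_matches:
        return 0.0
    total = sum(average_precision(r) for r in all_ranked_matches)
    return total / len(all_ranked_matches)


# Hypothetical ranked results for two queries (illustrative data only).
query_a = [True, False, True, False]   # AP = (1/1 + 2/3) / 2
query_b = [False, True, False, False]  # AP = (1/2) / 1
print(round(mean_average_precision([query_a, query_b]), 4))
```

A method that ranks all correct gallery tracklets ahead of incorrect ones scores an AP of 1.0 for that query, so a higher mAP indicates better ranking quality across all queries.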


Cited By

Wang Y, Pan T, Hu C (2024). Single-Task Joint Learning Model for an Online Multi-Object Tracking Framework. Applied Sciences 14:22 (10540). Online publication date: 15-Nov-2024.
https://doi.org/10.3390/app142210540

Index Terms

1. Networks
  1. Network algorithms
    1. Control path algorithms
      1. Network design and planning algorithms


Published In


ICCIR '21: Proceedings of the 2021 1st International Conference on Control and Intelligent Robotics

June 2021

807 pages

ISBN:9781450390231

DOI:10.1145/3473714

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

In-Cooperation

Chongqing Univ.: Chongqing University

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Research-article
Research
Refereed limited

Conference

ICCIR 2021

ICCIR 2021: 2021 International Conference on Control and Intelligent Robotics

June 18 - 20, 2021

Guangzhou, China

Acceptance Rates

ICCIR '21 Paper Acceptance Rate 131 of 239 submissions, 55%;

Overall Acceptance Rate 131 of 239 submissions, 55%
