Recurrent matching networks of spatial alignment learning for person re-identification

Lin, Lan; Zhang, Dan; Zheng, Xin; Ye, Mao; Guo, Jiuxia

doi:10.1007/s11042-019-08364-9

Published: 26 November 2019

Multimedia Tools and Applications, volume 79, pages 33735–33755 (2020)

Abstract

Person re-identification (re-id) usually refers to matching people across disjoint camera views. Many existing methods focus on extracting discriminative features or learning distance metrics so that intra-class distances are smaller than inter-class distances. These methods implicitly assume that pedestrian images are well aligned. However, a major challenge in person re-id is the unconstrained spatial misalignment between image pairs caused by changes in view angle and variations in pedestrian pose. To address this problem, we propose the Recurrent Matching Network of Spatial Alignment Learning (RMN-SAL), which simulates human visual perception. Reinforcement learning is introduced to locate attention regions, since it provides a flexible learning strategy for sequential decision-making. A linear mapping converts the environment state into a spatial constraint, incorporating spatial alignment into feature learning, and recurrent models extract information from a sequence of corresponding regions. Finally, person re-id is performed using both the global features and the features from the learned alignment regions. Our contributions are: 1) a recurrent matching network that combines local feature learning and sequential spatial correspondence learning in an end-to-end framework; 2) a location network, based on reinforcement learning, that learns task-specific sequential spatial correspondences for different image pairs through interactions between local pairwise internal representations. The proposed model is evaluated on three benchmarks, Market-1501, DukeMTMC-reID and CUHK03, and is reported to outperform the compared methods.
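
The pipeline outlined in the abstract (a linear mapping from a recurrent state to a location, stochastic sampling of attention regions, and matching on global plus region features) can be illustrated with a toy sketch. This is not the authors' implementation: every function name, the mean-intensity "feature", the tanh state update, and the choice to look at the same location in both images are assumptions made purely for illustration, and the reinforcement-learning training step is omitted.

```python
import math
import random


def linear_location(state, weights):
    """Linear mapping from the recurrent state to a 2-D location mean in [-1, 1]."""
    return [math.tanh(sum(w * s for w, s in zip(row, state))) for row in weights]


def sample_location(mean, sigma, rng):
    """Stochastic policy: Gaussian around the mean, clipped to the image range."""
    return [max(-1.0, min(1.0, rng.gauss(m, sigma))) for m in mean]


def crop_mean(image, loc, size=3):
    """Toy region feature: mean intensity of a patch centred at a normalised location."""
    h, w = len(image), len(image[0])
    cy = int(round((loc[0] + 1) / 2 * (h - 1)))
    cx = int(round((loc[1] + 1) / 2 * (w - 1)))
    half = size // 2
    rows = range(max(0, cy - half), min(h, cy + half + 1))
    cols = range(max(0, cx - half), min(w, cx + half + 1))
    vals = [image[r][c] for r in rows for c in cols]
    return sum(vals) / len(vals)


def glimpse_sequence(img_a, img_b, weights, steps=3, sigma=0.1, seed=0):
    """Visit a sequence of regions in both images, updating a shared recurrent state."""
    rng = random.Random(seed)
    state = [0.0] * len(weights[0])
    feats_a, feats_b, locs = [], [], []
    for _ in range(steps):
        loc = sample_location(linear_location(state, weights), sigma, rng)
        fa, fb = crop_mean(img_a, loc), crop_mean(img_b, loc)
        # Assumed toy recurrence: the state is driven by the pairwise difference.
        state = [math.tanh(0.5 * s + (fa - fb)) for s in state]
        feats_a.append(fa)
        feats_b.append(fb)
        locs.append(loc)
    return feats_a, feats_b, locs


def descriptor(image, region_feats):
    """Concatenate a global feature (mean intensity) with the region features."""
    global_feat = sum(sum(row) for row in image) / (len(image) * len(image[0]))
    return [global_feat] + region_feats


def distance(u, v):
    """Euclidean distance between two descriptors; smaller means a better match."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


if __name__ == "__main__":
    img = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
    other = [[float((r * 4 + c) % 5) for c in range(4)] for r in range(4)]
    weights = [[0.1, -0.2, 0.3], [0.0, 0.2, -0.1]]  # hypothetical 2 x 3 mapping
    fa, fb, _ = glimpse_sequence(img, other, weights)
    print(distance(descriptor(img, fa), descriptor(other, fb)))
```

In a trained model the region features would come from a learned network and the location policy would be updated from a matching reward; here the fixed weights only show how the pieces connect.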

Acknowledgments

This work was supported in part by the National Key R&D Program of China (2018YFC0831800), National Natural Science Foundation of China (61773093), Important Science and Technology Innovation Projects in Chengdu (2018-YF08-00039-GX) and Research Programs of Sichuan Science and Technology Department (2016JY0088, 17ZDYF3184).

Author information

Authors and Affiliations

School of Computer Science and Engineering, Center for Robotics, Key Laboratory for NeuroInformation of Ministry of Education, University of Electronic Science and Technology of China, Chengdu, 611731, China
Lan Lin, Dan Zhang & Mao Ye
China West Normal University, Nanchong, 637002, China
Xin Zheng
Civil Aviation Flight University of China, Guanghan, 618307, China
Jiuxia Guo

Corresponding author

Correspondence to Mao Ye.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Lin, L., Zhang, D., Zheng, X. et al. Recurrent matching networks of spatial alignment learning for person re-identification. Multimed Tools Appl 79, 33735–33755 (2020). https://doi.org/10.1007/s11042-019-08364-9

Received: 10 March 2019
Revised: 16 August 2019
Accepted: 09 October 2019
Published: 26 November 2019
Issue Date: December 2020
DOI: https://doi.org/10.1007/s11042-019-08364-9
