Learning Deep RGBT Representations for Robust Person Re-identification

Zheng, Ai-Hua; Chen, Zi-Han; Li, Cheng-Long; Tang, Jin; Luo, Bin

doi:10.1007/s11633-020-1262-z

Research Article
Published: 19 January 2021

Volume 18, pages 443–456 (2021)

International Journal of Automation and Computing

Abstract

Person re-identification (Re-ID) is the task of finding images of a specific person across a network of non-overlapping cameras, and it has seen many breakthroughs recently. However, it remains very challenging in adverse environmental conditions, especially in dark areas or at nighttime, because of the imaging limitations of a single visible-light source. To handle this problem, we propose a novel deep red-green-blue (RGB)-thermal (RGBT) representation learning framework for single-modality RGB person Re-ID. Because prevalent RGB Re-ID datasets lack thermal data, we use a generative adversarial network, trained on existing RGBT datasets, to translate labeled RGB person images into thermal infrared ones. The labeled RGB images and the synthetic thermal images make up a labeled RGBT training set, and we propose a cross-modal attention network that learns effective RGBT representations for person Re-ID in day and night by leveraging the complementary advantages of the RGB and thermal modalities. Extensive experiments on the Market-1501, CUHK03 and DukeMTMC-reID datasets demonstrate the effectiveness of our method, which achieves state-of-the-art performance on all three datasets.
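As a rough illustration of how complementary RGB and thermal features might be combined, the sketch below applies a simple per-channel attention over the two modalities. It is not the network described in the article, whose architecture is not given in this abstract. The function name `fuse_rgbt`, the pooling-plus-softmax gating, and the feature shapes are all assumptions chosen for illustration.

```python
import numpy as np


def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def fuse_rgbt(rgb_feat, thermal_feat):
    """Fuse two (C, H, W) feature maps with per-channel modality weights.

    Hypothetical sketch: each modality is summarised by global average
    pooling, and a softmax across the two modalities decides, channel by
    channel, how much each one contributes to the fused map.
    """
    if rgb_feat.shape != thermal_feat.shape:
        raise ValueError("RGB and thermal feature maps must share a shape")
    # Channel descriptors, stacked into shape (2, C).
    scores = np.stack([rgb_feat.mean(axis=(1, 2)),
                       thermal_feat.mean(axis=(1, 2))])
    weights = softmax(scores, axis=0)  # each column sums to 1
    fused = (weights[0][:, None, None] * rgb_feat
             + weights[1][:, None, None] * thermal_feat)
    return fused, weights


# Toy example: a "dark" RGB feature map (all zeros) and an active thermal one.
rng = np.random.default_rng(0)
rgb = np.zeros((4, 8, 4))
thermal = rng.random((4, 8, 4)) + 0.5
fused, weights = fuse_rgbt(rgb, thermal)
```

In this toy case the thermal branch receives the larger weight in every channel, which mirrors the intuition that thermal cues should dominate when the visible image carries little signal. A learned attention module would replace the fixed pooling-based scores with trainable layers.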

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 61976002, 61976003 and 61860206004), the Natural Science Foundation of Anhui Higher Education Institutions of China (No. KJ2019A0033), and the Open Project Program of the National Laboratory of Pattern Recognition (No. 201900046).

Author information

Authors and Affiliations

Anhui Provincial Key Laboratory of Multi-modal Cognitive Computation, School of Computer Science and Technology, Anhui University, Hefei, 230601, China
Ai-Hua Zheng, Zi-Han Chen, Cheng-Long Li, Jin Tang & Bin Luo

Corresponding author

Correspondence to Cheng-Long Li.

Additional information

Recommended by Associate Editor Jangmyung Lee

Ai-Hua Zheng received the B. Eng. and Ph. D. degrees in computer science and technology from Anhui University, China in 2006 and 2008, respectively. She received the Ph. D. degree in computer science from the University of Greenwich, UK in 2012. She visited the University of Stirling and Texas State University from June to September in 2013 and from September 2019 to August 2020, respectively. She is currently an associate professor and Ph. D. supervisor at the School of Computer Science and Technology, Anhui University, China. As first author or corresponding author, she has published more than 40 academic papers, including papers at top conferences such as the American Association for Artificial Intelligence Conference on Artificial Intelligence (AAAI) and the International Joint Conference on Artificial Intelligence (IJCAI), and in authoritative journals and venues such as IEEE Transactions on Systems, Man, and Cybernetics: Systems (TSMCS), Pattern Recognition (PR), Pattern Recognition Letters (PRL), Neurocomputing (NeuCom), Cognitive Computation (CogCom), the IEEE International Symposium on Network Computing and Applications (NCA), etc. She is a member of the China Computer Federation (CCF) and the China Society of Image and Graphics (CSIG). She also serves as a reviewer for representative conferences and journals, including AAAI, IJCAI, IEEE Transactions on Image Processing (TIP), IEEE Transactions on Multimedia (TMM), IEEE Transactions on Intelligent Transportation Systems (TITS), PR, etc. She received the Best Paper Award at the International Conference on Software Engineering Research, Management and Applications (SERA) 2017 and the Best Student Paper Award at a workshop of the IEEE International Conference on Multimedia and Expo (ICME) 2019.

Her research interests include vision-based artificial intelligence and pattern recognition, especially person/vehicle re-identification, audio-visual computing, and multi-modal intelligence.

Zi-Han Chen received the B. Eng. degree in software engineering from Anhui University, China in 2018. He is currently a master's student in computer science and technology at Anhui University, China.

His research interests include computer vision, person re-identification and machine learning.

Cheng-Long Li received the M. Sc. and Ph. D. degrees in computer science from School of Computer Science and Technology, Anhui University, China in 2013 and 2016, respectively. From 2014 to 2015, he worked as a visiting student with School of Data and Computer Science, Sun Yat-sen University, China. He was a postdoctoral research fellow at the Center for Research on Intelligent Perception and Computing (CRIPAC), National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences (CASIA), China. He is currently an associate professor at School of Computer Science and Technology, Anhui University, China. He was a recipient of the ACM Hefei Doctoral Dissertation Award in 2016.

His research interests include computer vision and deep learning.

Jin Tang received the B. Eng. degree in automation and the Ph. D. degree in computer science from Anhui University, China in 1999 and 2007, respectively. He is a professor with School of Computer Science and Technology, Anhui University.

His research interests include computer vision, pattern recognition and machine learning.

Bin Luo received the B. Eng. degree in electronics, and the M. Eng. degree in computer science from Anhui University, China in 1984 and 1991, respectively, and the Ph. D. degree in computer science from University of York, UK in 2002. From 2000 to 2004, he was a research associate with University of York, UK. He is currently a professor with Anhui University, China.

His research interests include graph spectral analysis, large image database retrieval, image and graph matching, statistical pattern recognition, digital watermarking and information security.

About this article

Cite this article

Zheng, AH., Chen, ZH., Li, CL. et al. Learning Deep RGBT Representations for Robust Person Re-identification. Int. J. Autom. Comput. 18, 443–456 (2021). https://doi.org/10.1007/s11633-020-1262-z

Received: 22 June 2020
Accepted: 10 October 2020
Published: 19 January 2021
Issue Date: June 2021
DOI: https://doi.org/10.1007/s11633-020-1262-z
