Abstract
Person re-identification (Re-ID) is the task of retrieving images of a specific person across a network of non-overlapping cameras, and it has seen many breakthroughs recently. However, it remains very challenging under adverse environmental conditions, especially in dark areas or at nighttime, because of the imaging limitations of a single visible-light source. To handle this problem, we propose a novel deep red-green-blue (RGB)-thermal (RGBT) representation learning framework for single-modality RGB person Re-ID. Because prevalent RGB Re-ID datasets lack thermal data, we use a generative adversarial network, trained on existing RGBT datasets, to translate labeled RGB person images into thermal infrared ones. The labeled RGB images and the synthetic thermal images make up a labeled RGBT training set, on which we propose a cross-modal attention network that learns effective RGBT representations for person Re-ID in day and night by leveraging the complementary advantages of the RGB and thermal modalities. Extensive experiments on the Market-1501, CUHK03 and DukeMTMC-reID datasets demonstrate the effectiveness of our method, which achieves state-of-the-art performance on all three datasets.
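The idea of weighting the RGB and thermal modalities against each other can be illustrated with a small sketch. This is not the network proposed in the paper, whose architecture is not given in the abstract; it is a minimal, hypothetical example of modality-level attention, assuming two same-length feature vectors and two scoring vectors (`w_rgb`, `w_thermal`) that stand in for learned parameters.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_rgbt(rgb_feat, thermal_feat, w_rgb, w_thermal):
    """Fuse RGB and thermal features with modality-level attention.

    All names here are hypothetical. w_rgb and w_thermal are placeholder
    scoring vectors; in a trained model they would be learned.
    Returns the fused vector and the (rgb, thermal) attention weights.
    """
    if not (len(rgb_feat) == len(thermal_feat) == len(w_rgb) == len(w_thermal)):
        raise ValueError("all vectors must have the same length")

    # One scalar score per modality: dot product of feature and scoring vector.
    s_rgb = sum(f * w for f, w in zip(rgb_feat, w_rgb))
    s_thermal = sum(f * w for f, w in zip(thermal_feat, w_thermal))

    # Softmax turns the two scores into weights that sum to 1.
    a_rgb, a_thermal = softmax([s_rgb, s_thermal])

    # The fused representation is the attention-weighted sum of both modalities.
    fused = [a_rgb * r + a_thermal * t for r, t in zip(rgb_feat, thermal_feat)]
    return fused, (a_rgb, a_thermal)


if __name__ == "__main__":
    # A dark scene: weak RGB response, strong thermal response (made-up values).
    fused, weights = fuse_rgbt([0.1, 0.1], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0])
    print(fused, weights)
```

In this toy setting, a modality whose features score higher receives a larger share of the fused representation, which is the intuition behind relying more on thermal cues when the visible-light image is uninformative.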
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Nos. 61976002, 61976003 and 61860206004), the Natural Science Foundation of Anhui Higher Education Institutions of China (No. KJ2019A0033), and the Open Project Program of the National Laboratory of Pattern Recognition (No. 201900046).
Author information
Additional information
Recommended by Associate Editor Jangmyung Lee
Ai-Hua Zheng received the B. Eng. and Ph. D. degrees in computer science and technology from Anhui University, China in 2006 and 2008, respectively. She received the Ph. D. degree in computer science from University of Greenwich, UK in 2012. She visited University of Stirling and Texas State University from June to September in 2013 and from September 2019 to August 2020, respectively. She is currently an associate professor and Ph. D. supervisor in the School of Computer Science and Technology, Anhui University, China. As first author or corresponding author, she has published more than 40 academic papers, including papers at top conferences such as the Association for the Advancement of Artificial Intelligence (AAAI) Conference on Artificial Intelligence and the International Joint Conference on Artificial Intelligence (IJCAI), and in authoritative journals and venues such as IEEE Transactions on Systems, Man, and Cybernetics: Systems (TSMCS), Pattern Recognition (PR), Pattern Recognition Letters (PRL), Neurocomputing (NeuCom), Cognitive Computation (CogCom), the IEEE International Symposium on Network Computing and Applications (NCA), etc. She is a member of China Computer Federation (CCF) and China Society of Image and Graphics (CSIG). She also serves as a reviewer for representative conferences and journals, including AAAI, IJCAI, IEEE Transactions on Image Processing (TIP), IEEE Transactions on Multimedia (TMM), IEEE Transactions on Intelligent Transportation Systems (TITS), PR, etc. She received the Best Paper Award at the International Conference on Software Engineering Research, Management and Applications (SERA) 2017 and the Best Student Paper Award at a workshop of the IEEE International Conference on Multimedia and Expo (ICME) 2019.
Her research interests include vision-based artificial intelligence and pattern recognition, especially person/vehicle re-identification, audio-visual computing, and multi-modal intelligence.
Zi-Han Chen received the B. Eng. degree in software engineering from Anhui University, China in 2018. He is currently a master's student in computer science and technology at Anhui University, China.
His research interests include computer vision, person re-identification and machine learning.
Cheng-Long Li received the M. Sc. and Ph. D. degrees in computer science from the School of Computer Science and Technology, Anhui University, China in 2013 and 2016, respectively. From 2014 to 2015, he was a visiting student at the School of Data and Computer Science, Sun Yat-sen University, China. He was a postdoctoral research fellow at the Center for Research on Intelligent Perception and Computing (CRIPAC), National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences (CASIA), China. He is currently an associate professor at the School of Computer Science and Technology, Anhui University, China. He was a recipient of the ACM Hefei Doctoral Dissertation Award in 2016.
His research interests include computer vision and deep learning.
Jin Tang received the B. Eng. degree in automation and the Ph. D. degree in computer science from Anhui University, China in 1999 and 2007, respectively. He is a professor with School of Computer Science and Technology, Anhui University.
His research interests include computer vision, pattern recognition and machine learning.
Bin Luo received the B. Eng. degree in electronics, and the M. Eng. degree in computer science from Anhui University, China in 1984 and 1991, respectively, and the Ph. D. degree in computer science from University of York, UK in 2002. From 2000 to 2004, he was a research associate with University of York, UK. He is currently a professor with Anhui University, China.
His research interests include graph spectral analysis, large image database retrieval, image and graph matching, statistical pattern recognition, digital watermarking and information security.
About this article
Cite this article
Zheng, AH., Chen, ZH., Li, CL. et al. Learning Deep RGBT Representations for Robust Person Re-identification. Int. J. Autom. Comput. 18, 443–456 (2021). https://doi.org/10.1007/s11633-020-1262-z
DOI: https://doi.org/10.1007/s11633-020-1262-z