Learning Deep RGBT Representations for Robust Person Re-identification

  • Research Article
  • Published in: International Journal of Automation and Computing

Abstract

Person re-identification (Re-ID) is the task of finding images of a specific person across a network of non-overlapping cameras, and it has seen many recent breakthroughs. However, it remains very challenging in adverse environmental conditions, especially in dark areas or at nighttime, due to the imaging limitations of a single visible light source. To handle this problem, we propose a novel deep red green blue (RGB)-thermal (RGBT) representation learning framework for single-modality RGB person Re-ID. Because prevalent RGB Re-ID datasets lack thermal data, we propose to use a generative adversarial network, trained on existing RGBT datasets, to translate labeled RGB person images into thermal infrared ones. The labeled RGB images and the synthetic thermal images make up a labeled RGBT training set, and we propose a cross-modal attention network that learns effective RGBT representations for person Re-ID in day and night by leveraging the complementary advantages of the RGB and thermal modalities. Extensive experiments on the Market-1501, CUHK03 and DukeMTMC-reID datasets demonstrate the effectiveness of our method, which achieves state-of-the-art performance on all of these person Re-ID datasets.
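The idea of weighting two modalities by attention can be illustrated with a toy example. The sketch below is not the authors' network, whose architecture is not described in this abstract. It assumes each modality is already summarized as a feature vector, uses a hypothetical reliability score (mean squared activation), and turns the two scores into fusion weights with a softmax. The names `modality_score` and `fuse_rgbt` are illustrative only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def modality_score(feature):
    """Hypothetical reliability score: mean squared activation (an assumption)."""
    return sum(x * x for x in feature) / len(feature)


def fuse_rgbt(rgb_feat, thermal_feat):
    """Fuse two equal-length feature vectors using softmax attention weights."""
    if len(rgb_feat) != len(thermal_feat):
        raise ValueError("feature vectors must have the same length")
    w_rgb, w_t = softmax([modality_score(rgb_feat), modality_score(thermal_feat)])
    fused = [w_rgb * r + w_t * t for r, t in zip(rgb_feat, thermal_feat)]
    return fused, (w_rgb, w_t)


# Toy input: a dark scene yields a weak RGB feature and a stronger thermal one.
rgb = [0.1, 0.0, 0.2, 0.1]
thermal = [0.9, 1.1, 0.8, 1.0]
fused, weights = fuse_rgbt(rgb, thermal)
```

In this toy setting the thermal vector receives the larger weight because its activations are stronger. A learned attention module would replace the hand-written score with trainable parameters.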

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 61976002, 61976003 and 61860206004), the Natural Science Foundation of Anhui Higher Education Institutions of China (No. KJ2019A0033), and the Open Project Program of the National Laboratory of Pattern Recognition (No. 201900046).

Author information

Corresponding author

Correspondence to Cheng-Long Li.

Additional information

Recommended by Associate Editor Jangmyung Lee

Ai-Hua Zheng received the B. Eng. and Ph. D. degrees in computer science and technology from Anhui University, China in 2006 and 2008, respectively. She received the Ph. D. degree in computer science from the University of Greenwich, UK in 2012. She visited the University of Stirling from June to September 2013 and Texas State University from September 2019 to August 2020. She is currently an associate professor and Ph. D. supervisor in the School of Computer Science and Technology at Anhui University, China. As first author or corresponding author, she has published more than 40 academic papers, including papers at top conferences such as the American Association for Artificial Intelligence Conference on Artificial Intelligence (AAAI) and the International Joint Conference on Artificial Intelligence (IJCAI), and in authoritative venues such as IEEE Transactions on Systems, Man, and Cybernetics: Systems (TSMCS), Pattern Recognition (PR), Pattern Recognition Letters (PRL), Neurocomputing (NeuCom), Cognitive Computation (CogCom), the IEEE International Symposium on Network Computing and Applications (NCA), etc. She is a member of the China Computer Federation (CCF) and the China Society of Image and Graphics (CSIG). She also serves as a reviewer for representative conferences and journals, including AAAI, IJCAI, IEEE Transactions on Image Processing (TIP), IEEE Transactions on Multimedia (TMM), IEEE Transactions on Intelligent Transportation Systems (TITS), PR, etc. She received the Best Paper Award at the International Conference on Software Engineering Research, Management and Applications (SERA) 2017 and the Best Student Paper Award at a workshop of the IEEE International Conference on Multimedia and Expo (ICME) 2019.

Her research interests include vision-based artificial intelligence and pattern recognition, especially person/vehicle re-identification, audio-visual computing, and multi-modal intelligence.

Zi-Han Chen received the B. Eng. degree in software engineering from Anhui University, China in 2018. He is currently a master's student in computer science and technology at Anhui University, China.

His research interests include computer vision, person re-identification and machine learning.

Cheng-Long Li received the M. Sc. and Ph. D. degrees in computer science from the School of Computer Science and Technology, Anhui University, China in 2013 and 2016, respectively. From 2014 to 2015, he was a visiting student with the School of Data and Computer Science, Sun Yat-sen University, China. He was a postdoctoral research fellow at the Center for Research on Intelligent Perception and Computing (CRIPAC), National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences (CASIA), China. He is currently an associate professor at the School of Computer Science and Technology, Anhui University, China. He was a recipient of the ACM Hefei Doctoral Dissertation Award in 2016.

His research interests include computer vision and deep learning.

Jin Tang received the B. Eng. degree in automation and the Ph. D. degree in computer science from Anhui University, China in 1999 and 2007, respectively. He is a professor with the School of Computer Science and Technology, Anhui University.

His research interests include computer vision, pattern recognition and machine learning.

Bin Luo received the B. Eng. degree in electronics and the M. Eng. degree in computer science from Anhui University, China in 1984 and 1991, respectively, and the Ph. D. degree in computer science from the University of York, UK in 2002. From 2000 to 2004, he was a research associate with the University of York, UK. He is currently a professor with Anhui University, China.

His research interests include graph spectral analysis, large image database retrieval, image and graph matching, statistical pattern recognition, digital watermarking and information security.

About this article

Cite this article

Zheng, AH., Chen, ZH., Li, CL. et al. Learning Deep RGBT Representations for Robust Person Re-identification. Int. J. Autom. Comput. 18, 443–456 (2021). https://doi.org/10.1007/s11633-020-1262-z

  • DOI: https://doi.org/10.1007/s11633-020-1262-z
