Abstract
Because person re-identification (ReID) benchmarks contain few pedestrian samples per category, many researchers use Generative Adversarial Networks (GANs) to generate additional samples and expand the datasets. Real and generated samples are then used together to train the person ReID model. Traditional GANs generate high-dimensional samples from noise. However, because pedestrian samples are complex, the visual quality of the generated samples is unsatisfactory. In this work, we propose a new generative model called the Encoder-Decoder Assisted Image Generative Adversarial Network (EDAGAN). EDAGAN improves the visual quality of the generated samples by reducing the dimensionality of the generated features, which are produced by a traditional GAN. In addition, many existing methods cannot optimize over real and generated samples simultaneously, so the person ReID model may not make good use of the generated samples to improve performance. To address this, we propose a new loss function called Soft Label Smoothing Regularization for Outliers (SLSRO), which facilitates the joint use of real and generated samples for model training. We use ResNet-50 as the backbone network to evaluate the effectiveness of EDAGAN and SLSRO. The experiments show that EDAGAN with SLSRO achieves a significant improvement over other models on three public benchmarks: Market-1501, DukeMTMC-ReID, and CUHK03.
Acknowledgements
This work was supported by the Postgraduate Research & Practice Innovation Program of Jiangsu Province (Project No. KYCX20_3083) and the Program of Shanghai Academic/Technology Research Leader (Project No. 18XD1423200).
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Wang, Y., Jiang, K., Lu, H. et al. Encoder-decoder assisted image generation for person re-identification. Multimed Tools Appl 81, 10373–10390 (2022). https://doi.org/10.1007/s11042-022-11907-2
DOI: https://doi.org/10.1007/s11042-022-11907-2