ABSTRACT
Cross-modal pedestrian re-identification matches color (visible) and infrared images of pedestrians to determine whether they depict the same person. Large inter-modal discrepancies and large intra-modal variations in these images lead to low recognition accuracy. This paper proposes a multi-loss joint cross-modal pedestrian re-identification method that fuses grayscale images with relation-aware global attention (RGA). First, the color images in the cross-modal dataset are converted to grayscale to reduce the network's dependence on color information. Second, a trained generative adversarial network translates the color and infrared images in the dataset into each other, reducing variation in pedestrian pose and modality. Then, RGA modules are embedded into a weight-shared dual-stream ResNet50 to capture more robust features. Finally, the hard-sample triplet loss is extended into a cross-modal hard-sample triplet loss; the label-smoothed cross-entropy loss is combined with the hard-sample triplet loss and the cross-modal hard-sample triplet loss, respectively, to form joint loss functions L1 and L2 that supervise training of the network. Experiments on the RegDB and SYSU-MM01 datasets achieve mAP of 75.85% and 69.56%, respectively, outperforming many current methods and indicating that the proposed method achieves better recognition accuracy.
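The two loss components combined in L1 and L2 can be sketched as follows. This is a minimal NumPy illustration under common conventions (Euclidean feature distances, uniform label smoothing spread over non-target classes), not the paper's exact formulation; the function names `smoothed_ce` and `cross_modal_hard_triplet` and all parameter defaults are illustrative assumptions.

```python
import numpy as np

def smoothed_ce(logits, labels, eps=0.1):
    """Label-smoothed cross-entropy: the target distribution puts
    (1 - eps) on the true class and eps/(C-1) on every other class."""
    n, c = logits.shape
    z = logits - logits.max(axis=1, keepdims=True)          # numerically stable
    log_p = z - np.log(np.exp(z).sum(axis=1, keepdims=True))  # log-softmax
    targets = np.full((n, c), eps / (c - 1))
    targets[np.arange(n), labels] = 1.0 - eps
    return float(-(targets * log_p).sum(axis=1).mean())

def cross_modal_hard_triplet(feats, pids, modality, margin=0.3):
    """For each anchor, mine the hardest (farthest) positive and the
    hardest (closest) negative from the *other* modality, then apply
    the standard triplet hinge with the given margin."""
    d = np.linalg.norm(feats[:, None] - feats[None, :], axis=2)  # pairwise dists
    losses = []
    for i in range(len(feats)):
        cross = modality != modality[i]      # candidates from the other modality
        pos = cross & (pids == pids[i])
        neg = cross & (pids != pids[i])
        if not pos.any() or not neg.any():
            continue
        hardest_pos = d[i][pos].max()        # farthest same-identity sample
        hardest_neg = d[i][neg].min()        # closest different-identity sample
        losses.append(max(0.0, hardest_pos - hardest_neg + margin))
    return float(np.mean(losses)) if losses else 0.0
```

A joint loss in the spirit of L2 would then simply be `smoothed_ce(logits, labels) + cross_modal_hard_triplet(feats, labels, modality)`; replacing the cross-modal mining with mining over all samples gives the L1 counterpart.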
Index Terms
- Multi-loss joint cross-modal pedestrian re-identification method fused with grayscale and RGA