Abstract
Pedestrian attribute retrieval (PAR) aims to retrieve soft-biometric attributes from pedestrian images captured by video surveillance. Despite recent progress, PAR still faces challenges, most notably the imbalanced distribution of attributes. We observe that this imbalance takes two forms: the well-studied inter-attribute imbalance and a largely overlooked intra-attribute imbalance. To address the latter, we introduce an attribute-wise reasoning reinforcement learning framework (AwRL). AwRL formulates PAR as a Markov decision process (MDP) and retrieves each attribute individually within reinforcement learning episodes. By traversing the entire PAR dataset, AwRL calibrates each attribute retrieval with its own reward scale, thereby alleviating the intra-attribute imbalance. In addition, we develop a novel supervised reinforcement loss function (SR-Loss) to make the retrieval model more robust. SR-Loss mitigates the training instability inherent in reinforcement learning's trial-and-error interaction with the environment. Experimental results on three benchmark datasets, PETA, RAP and PA100K, demonstrate the effectiveness of our approach and its capacity to overcome the intra-attribute imbalance problem.
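As a rough illustration of what an attribute-specific reward scale could look like, the sketch below derives one scale pair per attribute from the positive ratio observed over a whole label matrix. It is not the reward defined by AwRL: the weighting rule (a correct decision on the rarer class of an attribute earns the larger reward) and the function names `positive_ratios`, `reward_scales` and `reward` are assumptions made only for this example.

```python
import numpy as np


def positive_ratios(labels: np.ndarray) -> np.ndarray:
    """Fraction of positive samples for each attribute (one column per attribute)."""
    return labels.mean(axis=0)


def reward_scales(labels: np.ndarray, eps: float = 1e-6):
    """Hypothetical per-attribute reward scales (an assumption, not the paper's rule).

    For an attribute with positive ratio p, a positive sample is scaled by
    (1 - p) and a negative sample by p, so the rarer class weighs more.
    """
    p = np.clip(positive_ratios(labels), eps, 1.0 - eps)
    return 1.0 - p, p


def reward(action: np.ndarray, label: np.ndarray,
           pos_scale: np.ndarray, neg_scale: np.ndarray) -> np.ndarray:
    """Signed reward per attribute: +scale for a correct decision, -scale otherwise."""
    scale = np.where(label == 1, pos_scale, neg_scale)
    sign = np.where(action == label, 1.0, -1.0)
    return sign * scale


# Toy label matrix: 4 images, 2 attributes (positive ratios 0.25 and 0.5).
labels = np.array([[1, 1],
                   [0, 1],
                   [0, 0],
                   [0, 0]])
pos_scale, neg_scale = reward_scales(labels)
r = reward(np.array([1, 0]), np.array([1, 1]), pos_scale, neg_scale)
```

In this toy case the first attribute is positive in one image out of four, so a correct positive decision on it is rewarded more strongly than a correct positive decision on the balanced second attribute.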
Acknowledgements
This work was supported by the National Natural Science Foundation of China (NSFC) under Grant 62176178 and the Natural Science Foundation of Tianjin under Grant 19JCYBJC16000.
Author information
Contributions
YW: Methodology, writing. ZH: Methodology, software, writing. ZJ: Conceptualization, funding acquisition.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, Y., Hu, Z. & Ji, Z. Attribute-wise reasoning reinforcement learning for pedestrian attribute retrieval. Int J Multimed Info Retr 12, 35 (2023). https://doi.org/10.1007/s13735-023-00300-w
DOI: https://doi.org/10.1007/s13735-023-00300-w