Attribute-wise reasoning reinforcement learning for pedestrian attribute retrieval

  • Regular Paper
International Journal of Multimedia Information Retrieval

Abstract

Pedestrian attribute retrieval (PAR) aims to retrieve soft-biometric attributes from pedestrian images captured by video surveillance. Despite recent progress, PAR still faces challenges, most notably the imbalanced distribution of attributes. We highlight a critical observation: this imbalance has two parts, the inter-attribute imbalance and the often overlooked intra-attribute imbalance. To address the intra-attribute imbalance, we introduce an attribute-wise reasoning reinforcement learning framework (AwRL). AwRL formulates PAR as a Markov decision process (MDP) and retrieves attributes one at a time within reinforcement learning episodes. By traversing the entire PAR dataset, AwRL calibrates each attribute's retrieval with its own reward scale, thereby alleviating the intra-attribute imbalance. Additionally, we develop a novel supervised reinforcement loss function (SR-Loss) to make the retrieval model more robust. SR-Loss mitigates the training instability inherent in reinforcement learning's trial-and-error interaction with the environment. Experimental results on three benchmark datasets, PETA, RAP and PA100K, demonstrate the effectiveness of our approach and its capacity to overcome the intra-attribute imbalance problem.
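The abstract does not state the reward formula, so the following is only a minimal sketch of how attribute-wise reward scales could be derived from a single pass over a dataset. It assumes that intra-attribute imbalance means the skew between positive and negative samples of one attribute, and that the rarer label should earn the larger reward magnitude. The names `reward_scales`, `step_reward`, `run_episode` and the toy policy are hypothetical and do not come from the paper.

```python
import numpy as np


def reward_scales(labels):
    """Derive one pair of reward scales per attribute (assumed scheme).

    labels: (num_images, num_attributes) binary matrix.
    Returns an array of shape (num_attributes, 2) where column 0 is the
    scale used when the ground truth is 0 and column 1 when it is 1.
    The rarer label of each attribute receives the larger scale.
    """
    labels = np.asarray(labels, dtype=float)
    # Positive ratio of each attribute over the whole dataset.
    pos_ratio = np.clip(labels.mean(axis=0), 1e-6, 1.0 - 1e-6)
    return np.stack([pos_ratio, 1.0 - pos_ratio], axis=1)


def step_reward(scales, attr, action, truth):
    """Signed reward for one binary decision on one attribute."""
    scale = float(scales[attr, truth])
    return scale if action == truth else -scale


def run_episode(policy, feature, truths, scales):
    """One episode: decide the attributes of a single image one by one."""
    total = 0.0
    for attr, truth in enumerate(truths):
        action = policy(feature, attr)  # the state is (feature, attr)
        total += step_reward(scales, attr, action, int(truth))
    return total


# Toy data: attribute 0 is rarely positive, attribute 1 is mostly positive.
labels = np.array([[1, 1], [0, 1], [0, 1], [0, 0]])
scales = reward_scales(labels)


def always_positive(feature, attr):
    """Hypothetical stand-in policy that predicts 1 for every attribute."""
    return 1


print(run_episode(always_positive, None, labels[0], scales))
```

In this toy setting a correct prediction of the rare positive label of attribute 0 earns three times the reward of a correct prediction of its common negative label, which is one simple way to give each attribute its own reward scale.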


Acknowledgements

This work was supported by the National Natural Science Foundation of China (NSFC) under Grant 62176178 and the Natural Science Foundation of Tianjin under Grant 19JCYBJC16000.

Author information

Contributions

YW: Methodology, writing. ZH: Methodology, software, writing. ZJ: Conceptualization, funding acquisition.

Corresponding author

Correspondence to Zhong Ji.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

About this article

Cite this article

Wang, Y., Hu, Z. & Ji, Z. Attribute-wise reasoning reinforcement learning for pedestrian attribute retrieval. Int J Multimed Info Retr 12, 35 (2023). https://doi.org/10.1007/s13735-023-00300-w
