Abstract
In person attribute recognition (PAR), an individual is described by his or her appearance. PAR-based person retrieval is a cross-modal problem in which the input is a textual description of the person’s appearance and the output is an image of the person. This paper develops a PAR model by merging the large-scale RAP dataset with the person retrieval benchmark dataset of the AVSS 2018 challenge II. It uses a single deep network to detect a person’s attributes. The proposed approach uses five attributes: age, upper body (uBody) clothing color, uBody clothing type, lower body (lBody) clothing color, and lBody clothing type. Mask R-CNN is used for person detection, and the approach weighs each attribute to generate a ranking score for every detected person. Unlike existing approaches, the proposed method uses a single deep network and fewer attributes, and it achieves a state-of-the-art average intersection-over-union (IoU) of 66.7%, a retrieval rate of 85.6% at IoU \(\ge\) 0.4, and an average true positive rate (TPR) of 85.30%. These results improve on the existing state of the art in attribute-based person retrieval by 10.80% in average IoU, 5.94% in retrieval at IoU \(\ge\) 0.4, and 3.85% in TPR.
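The weighted ranking and the IoU criterion mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the attribute keys, the weight values, and the function names `ranking_score`, `iou`, and `retrieve` are hypothetical, and the abstract does not state how the weights are chosen or combined. The sketch assumes each detected person comes with a bounding box and, per attribute, a mapping from class label to predicted probability.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2)
Probs = Dict[str, Dict[str, float]]       # attribute -> {label: probability}

# Hypothetical weights; the actual values are not given in the abstract.
WEIGHTS = {
    "age": 1.0,
    "ubody_color": 2.0,
    "ubody_type": 1.0,
    "lbody_color": 2.0,
    "lbody_type": 1.0,
}


def ranking_score(query: Dict[str, str], probs: Probs,
                  weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of the predicted probability of each queried label."""
    return sum(weights[attr] * probs.get(attr, {}).get(label, 0.0)
               for attr, label in query.items())


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def retrieve(query: Dict[str, str],
             detections: List[Tuple[Box, Probs]]) -> Box:
    """Return the box of the detection with the highest ranking score."""
    best_box, _ = max(detections, key=lambda d: ranking_score(query, d[1]))
    return best_box
```

Under these assumptions, a retrieval would count as correct at the abstract's threshold when `iou(retrieve(query, detections), ground_truth_box) >= 0.4`.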
Acknowledgements
The authors acknowledge NVIDIA Corporation’s support by way of a donation of the Quadro K5200 GPU used for this research. We would also like to thank the AVSS 2018 challenge II organizers and Mr. Dhyey Savaliya for providing inputs at various stages.
About this article
Cite this article
Galiyawala, H., Raval, M.S. & Patel, M. Person retrieval in surveillance videos using attribute recognition. J Ambient Intell Human Comput 15, 291–303 (2024). https://doi.org/10.1007/s12652-022-03891-0
DOI: https://doi.org/10.1007/s12652-022-03891-0