Abstract
Person search, which aims to find a target person in whole scene images, has become a crucial application in security surveillance, where images contain various types of noise. Existing methods address two subtasks simultaneously: detection, which distinguishes people from background noise, and identification, which separates the target from the noise of other people. However, these two tasks are fundamentally contradictory: detection is designed to be robust to intra-class variations so that it captures general human features, while identification is designed to be sensitive to inter-class variations so that it focuses only on the target. Inspired by the observation that police first narrow the scope to a small group of people similar to the suspect and then locate the target, we propose a Dual-Focus framework consisting of a Coarse-Grained Focus (search for similar people) and a Fine-Grained Focus (find the target). Extensive experiments on datasets such as CUHK-SYSU and PRW are conducted to evaluate the effectiveness of the proposed method. The evaluations demonstrate that our method delivers superior results compared with state-of-the-art methods.







Acknowledgements
The work was supported by the National Natural Science Foundation of China (U1803262, 62171325), the National Key R&D Project (2021YFC3320301), the Key Project of Scientific Research Plan of Hubei Provincial Department of Education (No. D20211106), and the Opening Foundation of Key Laboratory of Fundamental Science for National Defense on Vision Synthetization, Sichuan University, China (Grant No. 2021SCUVS003). The numerical calculations in this paper were performed on the supercomputing system in the Supercomputing Center of Wuhan University.
Cite this article
Hu, W., Wang, X., Wang, Z. et al. Dual-focus: person search from Coarse-Grained Focus to Fine-Grained Focus. Multimedia Systems 29, 3105–3114 (2023). https://doi.org/10.1007/s00530-022-00929-3
DOI: https://doi.org/10.1007/s00530-022-00929-3