ABSTRACT
Person re-identification has received increasing attention in recent years. However, the pedestrian images used by most existing algorithms are produced by cropping whole surveillance images, either manually or automatically, and each cropped image usually contains only one person. In this paper, we study person re-identification on whole, uncropped surveillance images (i.e., person search), which is closer to the real-world scenario. The challenges of person search mainly come from (1) unavailable bounding boxes for pedestrians and (2) high time and hardware cost. To address these two issues, we propose a multi-level feature fused framework (MLF), which handles pedestrian detection and person re-identification in a unified network. The first module of the framework serves as the common module for both pedestrian detection and person re-identification. The second module is designed for pedestrian detection: three scales of feature maps from different layers are fused to obtain precise pedestrian bounding boxes. In addition, we use appropriate anchors and introduce soft-NMS to reduce missed and false detections, especially for small pedestrians. The third module is designed for person re-identification: we use aggregated residual transformations in some convolutional layers, which reduces the number of convolutional parameters and the training time. Experiments on CUHK-SYSU and PRW show the effectiveness of the proposed method.
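To make the soft-NMS step mentioned in the abstract more concrete, the following is a minimal sketch of the general Gaussian soft-NMS technique in plain Python. It is not the authors' implementation: the function names `iou` and `soft_nms`, the `(x1, y1, x2, y2)` box format, and the default values of `sigma` and `score_thresh` are assumptions chosen for illustration. Instead of discarding every box that overlaps the current best detection, the score of each overlapping box is decayed according to its overlap, which is how the technique can retain nearby pedestrians that hard NMS would suppress.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian soft-NMS (illustrative; sigma and score_thresh are assumed defaults).

    Returns a list of (box_index, decayed_score) in the order boxes are selected.
    """
    candidates = list(zip(range(len(boxes)), scores))
    kept = []
    while candidates:
        # Select the remaining box with the highest current score.
        best_pos = max(range(len(candidates)), key=lambda i: candidates[i][1])
        best_idx, best_score = candidates.pop(best_pos)
        kept.append((best_idx, best_score))

        # Decay the scores of the other boxes by their overlap with the selected box.
        rescored = []
        for idx, score in candidates:
            overlap = iou(boxes[best_idx], boxes[idx])
            new_score = score * math.exp(-(overlap ** 2) / sigma)
            # Drop boxes whose score has decayed below the threshold.
            if new_score >= score_thresh:
                rescored.append((idx, new_score))
        candidates = rescored
    return kept


if __name__ == "__main__":
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    demo_scores = [0.9, 0.8, 0.7]
    for index, score in soft_nms(demo_boxes, demo_scores):
        print(index, round(score, 4))
```

In the demo, the second box overlaps the first heavily, so its score is reduced rather than removed, while the distant third box keeps its original score.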
Index Terms
- Person Search Based on Improved Joint Learning Network