Count- and Similarity-Aware R-CNN for Pedestrian Detection

Xie, Jin; Cholakkal, Hisham; Muhammad Anwer, Rao; Shahbaz Khan, Fahad; Pang, Yanwei; Shao, Ling; Shah, Mubarak

doi:10.1007/978-3-030-58520-4_6

Jin Xie, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan, Yanwei Pang, Ling Shao & Mubarak Shah

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12362)

Included in the following conference series:

European Conference on Computer Vision

Abstract

Recent pedestrian detection methods generally rely on additional supervision, such as visible bounding-box annotations, to handle heavy occlusions. We propose an approach that leverages pedestrian count and proposal similarity information within a two-stage pedestrian detection framework. Both pedestrian count and proposal similarity are derived from the standard full-body annotations commonly used to train pedestrian detectors. We introduce a count-weighted detection loss function that assigns higher weights to detection errors occurring at highly overlapping pedestrians. The proposed loss function is used at both stages of the two-stage detector. We further introduce a count-and-similarity branch within the two-stage detection framework, which predicts pedestrian count as well as proposal similarity. Lastly, we introduce a count- and similarity-aware NMS strategy to identify distinct proposals. Our approach requires neither part information nor visible bounding-box annotations. Experiments are performed on the CityPersons and CrowdHuman datasets. Our method sets a new state of the art on both datasets. Further, it achieves an absolute gain of 2.4% over the current state of the art, in terms of log-average miss rate, on the heavily occluded (HO) set of the CityPersons test set. Finally, we demonstrate the applicability of our approach to the problem of human instance segmentation. Code and models are available at: https://github.com/Leotju/CaSe.
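
For readers unfamiliar with the metric quoted above: the log-average miss rate, as defined in the Caltech evaluation protocol of Dollár et al., is the geometric mean of the miss rate sampled at nine false-positives-per-image (FPPI) values evenly spaced in log space between 10^-2 and 10^0. The following is a minimal sketch of that computation, not the authors' evaluation code. The function name and its inputs (a miss-rate curve given as arrays sorted by ascending FPPI) are assumptions made for illustration.

```python
import numpy as np


def log_average_miss_rate(fppi, miss_rate, num_points=9):
    """Geometric mean of the miss rate at FPPI points log-spaced in [1e-2, 1e0].

    Hypothetical helper: `fppi` and `miss_rate` are assumed to describe one
    miss-rate curve, with `fppi` sorted in ascending order.
    """
    fppi = np.asarray(fppi, dtype=float)
    miss_rate = np.asarray(miss_rate, dtype=float)

    # Reference FPPI values: 10^-2, 10^-1.75, ..., 10^0.
    ref_points = np.logspace(-2.0, 0.0, num_points)

    # If the curve has no point at or below a reference FPPI, count it as a
    # miss rate of 1.0 (every pedestrian missed at that operating point).
    sampled = np.ones(num_points)
    for i, ref in enumerate(ref_points):
        idx = np.nonzero(fppi <= ref)[0]
        if idx.size > 0:
            # Use the curve point with the largest FPPI not exceeding `ref`.
            sampled[i] = miss_rate[idx[-1]]

    # Average in log space; the floor avoids log(0) for a perfect detector.
    return float(np.exp(np.mean(np.log(np.maximum(sampled, 1e-10)))))
```

Lower values are better, so an absolute gain of 2.4% corresponds to a reduction of 2.4 percentage points in this quantity.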

J. Xie and H. Cholakkal contributed equally to this work.

Notes

1. Thanks to the PedHunter [7] authors for sharing head annotations on the CityPersons validation set through email correspondence.
2. More results are available at https://github.com/Leotju/CaSe.

References

1. Bodla, N., Singh, B., Chellappa, R., Davis, L.S.: Soft-NMS: improving object detection with one line of code. In: ICCV (2017)
2. Brazil, G., Yin, X., Liu, X.: Illuminating pedestrians via simultaneous detection & segmentation. In: ICCV (2017)
3. Cai, Z., Fan, Q., Feris, R.S., Vasconcelos, N.: A unified multi-scale deep convolutional neural network for fast object detection. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9908, pp. 354–370. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46493-0_22
4. Cai, Z., Vasconcelos, N.: Cascade R-CNN: high quality object detection and instance segmentation. arXiv preprint arXiv:1906.09756 (2019)
5. Cao, J., Pang, Y., Han, J., Gao, B., Li, X.: Taking a look at small-scale pedestrians and occluded pedestrians. IEEE Trans. Image Process. 29, 3143–3152 (2020)
6. Cao, J., Pang, Y., Zhao, S., Li, X.: High-level semantic networks for multi-scale object detection. IEEE Trans. Circ. Syst. Video Technol. 30, 3372–3386 (2019)
7. Chi, C., Zhang, S., Xing, J., Lei, Z., Li, S.Z., Zou, X.: PedHunter: occlusion robust pedestrian detector in crowded scenes. In: AAAI (2020)
8. Dollár, P., Wojek, C., Schiele, B., Perona, P.: Pedestrian detection: an evaluation of the state of the art. TPAMI 34, 743–761 (2012)
9. He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. In: ICCV (2017)
10. Hosang, J., Benenson, R., Schiele, B.: Learning non-maximum suppression. In: CVPR (2017)
11. Hu, H., Gu, J., Zhang, Z., Dai, J., Wei, Y.: Relation networks for object detection. In: CVPR (2018)
12. Jiang, B., Luo, R., Mao, J., Xiao, T., Jiang, Y.: Acquisition of localization confidence for accurate object detection. In: ECCV (2018)
13. Liu, S., Huang, D., Wang, Y.: Adaptive NMS: refining pedestrian detection in a crowd. In: CVPR (2019)
14. Liu, W., Liao, S., Hu, W., Liang, X., Chen, X.: Learning efficient single-stage pedestrian detectors by asymptotic localization fitting. In: ECCV (2018)
15. Liu, W., Liao, S., Ren, W., Hu, W., Yu, Y.: High-level semantic feature detection: a new perspective for pedestrian detection. In: CVPR (2019)
16. Mao, J., Xiao, T., Jiang, Y., Cao, Z.: What can help pedestrian detection? In: CVPR (2017)
17. Mathias, M., Benenson, R., Timofte, R., Gool, L.V.: Handling occlusions with franken-classifiers. In: ICCV (2013)
18. Nie, J., Anwer, R.M., Cholakkal, H., Khan, F.S., Pang, Y., Shao, L.: Enriched feature guided refinement network for object detection. In: ICCV (2019)
19. Ouyang, W., Wang, X.: Joint deep learning for pedestrian detection. In: ICCV (2013)
20. Pang, Y., Xie, J., Khan, M.H., Anwer, R.M., Khan, F.S., Shao, L.: Mask-guided attention network for occluded pedestrian detection. In: ICCV (2019)
21. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: NIPS (2015)
22. Shao, S., et al.: CrowdHuman: a benchmark for detecting human in a crowd. arXiv preprint arXiv:1805.00123 (2018)
23. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
24. Song, T., Sun, L., Xie, D., Sun, H., Pu, S.: Small-scale pedestrian detection based on topological line localization and temporal feature aggregation. In: ECCV (2018)
25. Tian, Y., Luo, P., Wang, X., Tang, X.: Deep learning strong parts for pedestrian detection. In: ICCV (2015)
26. Tychsen-Smith, L., Petersson, L.: Improving object localization with Fitness NMS and Bounded IoU loss. In: CVPR (2018)
27. Wang, X., Xiao, T., Jiang, Y., Shao, S., Sun, J., Shen, C.: Repulsion loss: detecting pedestrians in a crowd. In: CVPR (2018)
28. Xie, J., Pang, Y., Cholakkal, H., Anwer, R.M., Khan, F.S., Shao, L.: PSC-Net: learning part spatial co-occurrence for occluded pedestrian detection. arXiv preprint arXiv:2001.09252 (2020)
29. Zhang, J., et al.: Attribute-aware pedestrian detection in a crowd. arXiv preprint arXiv:1910.09188 (2019)
30. Zhang, S., Benenson, R., Schiele, B.: CityPersons: a diverse dataset for pedestrian detection. In: CVPR (2017)
31. Zhang, S., Yang, J., Schiele, B.: Occluded pedestrian detection through guided attention in CNNs. In: CVPR (2018)
32. Zhang, S., Wen, L., Bian, X., Lei, Z., Li, S.Z.: Occlusion-aware R-CNN: detecting pedestrians in a crowd. In: ECCV (2018)
33. Zhang, S.H., et al.: Pose2Seg: detection free human instance segmentation. In: CVPR (2019)
34. Zhou, C., Yang, M., Yuan, J.: Discriminative feature transformation for occluded pedestrian detection. In: ICCV (2019)
35. Zhou, C., Yuan, J.: Non-rectangular part discovery for object detection. In: BMVC (2014)
36. Zhou, C., Yuan, J.: Multi-label learning of part detectors for heavily occluded pedestrian detection. In: ICCV (2017)
37. Zhou, C., Yuan, J.: Bi-box regression for pedestrian detection and occlusion estimation. In: ECCV (2018)

Acknowledgment

This work is supported by the National Key R&D Program of China (Grants No. 2018AAA0102800 and 2018AAA0102802) and the National Natural Science Foundation of China (Grant No. 61632018).

Author information

Authors and Affiliations

Tianjin Key Laboratory of Brain-Inspired Artificial Intelligence, School of Electrical and Information Engineering, Tianjin University, Tianjin, China
Jin Xie & Yanwei Pang
Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE
Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan & Ling Shao
Inception Institute of Artificial Intelligence, Abu Dhabi, UAE
Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan & Ling Shao
University of Central Florida, Orlando, USA
Mubarak Shah

Corresponding author

Correspondence to Yanwei Pang.

Editor information

Editors and Affiliations

University of Oxford, Oxford, UK
Andrea Vedaldi
Graz University of Technology, Graz, Austria
Horst Bischof
University of Freiburg, Freiburg im Breisgau, Germany
Thomas Brox
University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Jan-Michael Frahm

Cite this paper

Xie, J., et al. (2020). Count- and Similarity-Aware R-CNN for Pedestrian Detection. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds) Computer Vision – ECCV 2020. Lecture Notes in Computer Science, vol 12362. Springer, Cham. https://doi.org/10.1007/978-3-030-58520-4_6

DOI: https://doi.org/10.1007/978-3-030-58520-4_6
Published: 19 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58519-8
Online ISBN: 978-3-030-58520-4
eBook Packages: Computer Science, Computer Science (R0)