Pyramid context learning for object detection

Ding, Pengxin; Zhang, Jianping; Zhou, Huan; Zou, Xiang; Wang, Minghui

doi:10.1007/s11227-020-03168-3

Pyramid context learning for object detection

Published: 24 February 2020

Volume 76, pages 9374–9387, (2020)
Cite this article

The Journal of Supercomputing Aims and scope Submit manuscript

Pengxin Ding^1,2,
Jianping Zhang²,
Huan Zhou¹,
Xiang Zou² &
…
Minghui Wang ORCID: orcid.org/0000-0002-1594-136X¹

394 Accesses
9 Citations
Explore all metrics

Abstract

Contextual information in complex scenarios is critical for accurate object detection. Existing state-of-the-art detectors have greatly improved detection performance with the use of contexts around objects. However, these detectors consider the local and global contexts separately, which limits the improvement in detection accuracy. In this paper, we propose a pyramid context learning module (PCL) for object detection, which makes full use of the feature context at different levels. Specifically, two operators, named aggregation and distribution, are designed to assemble and synthesize contextual information at different levels. In addition, a channel context learning operator is also used to capture the channel context. PCL is a universal module, so it can be easily integrated into most of the detection frameworks. To evaluate our PCL, we apply it into some popular detectors, e.g., SSD, Faster R-CNN and RetinaNet, and conduct extensive experiments on PASCAL VOC and MS COCO datasets. Experimental results show that PCL can produce competitive performance gains and significantly improve the baselines.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Fig. 1

Fig. 3

Context Refinement for Object Detection

Feature refinement with multi-level context for object detection

Article 12 May 2023

Multi-scale Feature and Spatial Relation Inference for Object Detection

References

Bell S, Lawrence Zitnick C, Bala K, Girshick R (2016) Inside–outside net: detecting objects in context with skip pooling and recurrent neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp 2874–2883
Cai Z, Vasconcelos N (2018) Cascade R-CNN: delving into high quality object detection. In: IEEE CVPR
Chen X, Gupta A (2017) Spatial memory for context reasoning in object detection. In: Proceedings of the IEEE International Conference on Computer Vision. pp 4086–4096
Chen X, Li LJ, Fei-Fei L, Gupta A (2018) Iterative visual reasoning beyond convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp 7239–7248
Chen Y, Li W, Sakaridis C, Dai D, Van Gool L (2018) Domain adaptive faster R-CNN for object detection in the wild. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp 3339–3348
Chen Z, Huang S, Tao D (2018) Context refinement for object detection. In: The European Conference on Computer Vision (ECCV)
Dai J, Li Y, He K, Sun J (2016) R-FCN: object detection via region-based fully convolutional networks. In: Advances in Neural Information Processing Systems. pp 379–387
Dai J, Qi H, Xiong Y, Li Y, Zhang G, Hu H, Wei Y (2017) Deformable convolutional networks. In: Proceedings of the IEEE International Conference on Computer Vision. pp 764–773
Duan K, Bai S, Xie L, Qi H, Huang Q, Tian Q (2019) Centernet: object detection with keypoint triplets. ArXiv preprint arXiv:1904.08189
Everingham M, Van Gool L, Williams CK, Winn J, Zisserman A (2010) The pascal visual object classes (VOC) challenge. Int J Comput Vis 88(2):303–338
Article Google Scholar
Fu CY, Liu W, Ranga A, Tyagi A, Berg AC (2017) Dssd: Deconvolutional single shot detector. arXiv preprint arXiv:1701.06659
Ghiasi G, Lin TY, Le QV (2019) Nas-fpn: Learning scalable feature pyramid architecture for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp 7036–7045
Hara K, Liu MY, Tuzel O, Farahmand Am (2017) Attentional network for visual object detection. arXiv preprint arXiv:1702.01478
He K, Gkioxari G, Dollár P, Girshick R (2017) Mask r-cnn. In: Computer Vision (ICCV), 2017 IEEE International Conference on. IEEE, pp 2980–2988
He K, Zhang X, Ren S, Sun J (2016) Identity mappings in deep residual networks. In: European Conference on Computer Vision. Springer, pp 630–645
Kong T, Sun F, Yao A, Liu H, Lu M, Chen Y (2017) Ron: Reverse connection with objectness prior networks for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp 5936–5944
Kou G, Yang P, Peng Y, Xiao F, Chen Y, Alsaadi FE (2020) Evaluation of feature selection methods for text classification with small datasets using multiple criteria decision-making methods. Appl Soft Comput 86:105836
Article Google Scholar
Law H, Deng J (2018) Cornernet: Detecting objects as paired keypoints. In: Proceedings of the European Conference on Computer Vision (ECCV). pp 734–750
Lee H, Eum S, Kwon H (2017) Me r-cnn: Multi-expert r-cnn for object detection. arXiv preprint arXiv:1704.01069
Leng J, Liu Y (2019) An enhanced SSD with feature fusion and visual reasoning for object detection. Neural Comput Appl 31(10):6549–6558
Article Google Scholar
Leng J, Liu Y, Du D, Zhang T, Quan P (2019) Robust obstacle detection and recognition for driver assistance systems. IEEE Trans Intell Transp Syst. https://doi.org/10.1109/TITS.2019.2909275
Article Google Scholar
Li J, Wei Y, Liang X, Dong J, Xu T, Feng J, Yan S (2016) Attentive contexts for object detection. IEEE Trans Multimedia 19(5):944–954
Article Google Scholar
Li J, Wei Y, Liang X, Dong J, Xu T, Feng J, Yan S (2017) Attentive contexts for object detection. IEEE Trans Multimedia 19(5):944–954
Article Google Scholar
Li T, Kou G, Peng Y, Shi Y (2017) Classifying with adaptive hyper-spheres: an incremental classifier based on competitive learning. IEEE Trans Syst Man Cybern Syst. https://doi.org/10.1109/TSMC.2017.2761360
Article Google Scholar
Li X, Jiang S (2019) Know more say less: image captioning based on scene graphs. IEEE Trans Multimedia 21(8):2117–2130
Article Google Scholar
Li Z, Peng C, Yu G, Zhang X, Deng Y, Sun J (2017) Light-head r-cnn: In defense of two-stage object detector. arXiv preprint arXiv:1711.07264
Lin TY, Dollár P, Girshick R, He K, Hariharan B, Belongie S (2017) Feature pyramid networks for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp 2117–2125
Lin TY, Goyal P, Girshick R, He K, Dollár P (2017) Focal loss for dense object detection. In: Proceedings of the IEEE International Conference on Computer Vision. pp 2980–2988
Lin TY, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Dollár P, Zitnick CL (2014) Microsoft coco: Common objects in context. In: European Conference on Computer Vision. Springer, pp 740–755
Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu CY, Berg AC (2016) Ssd: Single shot multibox detector. In: European Conference on Computer Vision. Springer, pp 21–37
Neubeck A, Van Gool L (2006) Efficient non-maximum suppression. In: 18th International Conference on Pattern Recognition (ICPR’06). IEEE, vol 3, pp 850–855
Redmon J, Divvala S, Girshick R, Farhadi A (2016) You only look once: Unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp 779–788
Redmon J, Farhadi A (2017) Yolo9000: better, faster, stronger. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp 7263–7271
Redmon J, Farhadi A (2018) Yolov3: An incremental improvement. arXiv preprint arXiv:1804.02767
Ren S, He K, Girshick R, Sun J (2017) Faster r-cnn: towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell 6:1137–1149
Article Google Scholar
Shrivastava A, Gupta A, Girshick R (2016) Training region-based object detectors with online hard example mining. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp 761–769
Simonyan K, Zisserman (2014) A Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556
Su Y, Li Y, Xu N, Liu AA (2019) Hierarchical deep neural network for image captioning. Neural Process Lett. https://doi.org/10.1007/s11063-019-09997-5
Article Google Scholar
Tang X, Du DK, He Z, Liu J (2018) Pyramidbox: A context-assisted single shot face detector. In: Proceedings of the European Conference on Computer Vision (ECCV). pp 797–813
Tychsen-Smith L, Petersson L (2018) Improving object localization with fitness nms and bounded iou loss. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp 6877–6885
Wang X, Shrivastava A, Gupta A (2017) A-fast-rcnn: Hard positive generation via adversary for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp 2606–2615
Woo S, Park J, Lee JY, So Kweon I (2018) Cbam: Convolutional block attention module. In: Proceedings of the European Conference on Computer Vision (ECCV). pp 3–19
Yang L, Tang K, Yang J, Li LJ (2017) Dense captioning with joint inference and visual context. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp 2193–2202
Zhang S, Wen L, Bian X, Lei Z, Li SZ (2018) Single-shot refinement neural network for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp 4203–4212
Zhu C, He Y, Savvides M (2019) Feature selective anchor-free module for single-shot object detection. arXiv preprint arXiv:1903.00621
Zhu Y, Zhao C, Wang J, Zhao X, Wu Y, Lu H (2017) Couplenet: Coupling global structure with local parts for object detection. In: Proceedings of the IEEE International Conference on Computer Vision. pp 4126–4134

Download references

Author information

Authors and Affiliations

College of Computer Science, Sichuan University, Chengdu, China
Pengxin Ding, Huan Zhou & Minghui Wang
The Second Research Institute of CAAC, Chengdu, China
Pengxin Ding, Jianping Zhang & Xiang Zou

Authors

Pengxin Ding
View author publications
You can also search for this author in PubMed Google Scholar
Jianping Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Huan Zhou
View author publications
You can also search for this author in PubMed Google Scholar
Xiang Zou
View author publications
You can also search for this author in PubMed Google Scholar
Minghui Wang
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Minghui Wang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Ding, P., Zhang, J., Zhou, H. et al. Pyramid context learning for object detection. J Supercomput 76, 9374–9387 (2020). https://doi.org/10.1007/s11227-020-03168-3

Download citation

Published: 24 February 2020
Issue Date: December 2020
DOI: https://doi.org/10.1007/s11227-020-03168-3

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Pyramid context learning for object detection

Abstract

Access this article

Similar content being viewed by others

Context Refinement for Object Detection

Feature refinement with multi-level context for object detection

Multi-scale Feature and Spatial Relation Inference for Object Detection

References

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Pyramid context learning for object detection

Abstract

Access this article

Similar content being viewed by others

Context Refinement for Object Detection

Feature refinement with multi-level context for object detection

Multi-scale Feature and Spatial Relation Inference for Object Detection

References

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation