Object detector with enriched global context information

Guo, Jingjuan; Yuan, Caihong; Zhao, Zhiqiang; Feng, Ping; Luo, Yihao; Wang, Tianjiang

doi:10.1007/s11042-020-09500-6

Published: 11 August 2020

Multimedia Tools and Applications, volume 79, pages 29551–29571 (2020)

Jingjuan Guo¹,², Caihong Yuan³, Zhiqiang Zhao², Ping Feng⁴, Yihao Luo¹ & Tianjiang Wang¹

Abstract

Adding more context information to improve detection accuracy is an important problem in object detection. In this paper, we propose EGCI-Net, a new object detector with enriched global context information obtained through a pyramid feature pool module and several global activation blocks. Like DSOD, EGCI-Net is a one-stage object detector trained from scratch. The global activation blocks are added to the backbone sub-network of the detector to weaken the local information in the feature maps of detected objects and strengthen their global context. The pyramid feature pool module uses multi-scale global average pooling to produce multi-scale global context features that supervise the pyramid features. The features obtained by the main structure are then fused with the pyramid pooling features and fed into the final multibox detector. We evaluated our detector on the Pascal VOC and MS COCO datasets. The experimental results show that the proposed detector achieves better results than DSOD and outperforms most existing detectors, and that it detects partially occluded objects and small objects especially well.
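
The abstract does not state the pooling scales or how the pooled features are fused, so the following is only a minimal sketch of the general idea of multi-scale global average pooling, not the authors' implementation. It assumes a single-channel feature map stored as a list of rows and bin sizes of 1, 2 and 4; the function names `adaptive_avg_pool` and `pyramid_pool` are hypothetical and chosen for illustration.

```python
import math


def adaptive_avg_pool(fmap, bins):
    """Average-pool a 2D feature map (list of rows) down to bins x bins."""
    h, w = len(fmap), len(fmap[0])
    pooled = []
    for i in range(bins):
        # Row range covered by output bin i (floor start, ceil end).
        r0, r1 = (i * h) // bins, math.ceil((i + 1) * h / bins)
        row = []
        for j in range(bins):
            # Column range covered by output bin j.
            c0, c1 = (j * w) // bins, math.ceil((j + 1) * w / bins)
            cells = [fmap[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled


def pyramid_pool(fmap, bin_sizes=(1, 2, 4)):
    """Pool the same map at several scales (bin sizes are an assumption).

    bins == 1 is plain global average pooling; larger bin counts keep
    progressively more spatial detail.
    """
    return {b: adaptive_avg_pool(fmap, b) for b in bin_sizes}


# Toy 4x4 feature map with values 0..15.
fmap = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
context = pyramid_pool(fmap)
print(context[1])  # global average of the whole map
print(context[2])  # one average per quadrant
```

In a real detector each pooled map would typically be computed per channel, resized back to the resolution of the corresponding pyramid level, and combined with it; those steps are omitted here because the abstract gives no details about them.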

Acknowledgment

This work is supported by the Natural Science Foundation of China (Grant 61572214 and U1536203).

Author information

Authors and Affiliations

School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China
Jingjuan Guo, Yihao Luo & Tianjiang Wang
School of Information Science and Technology, Jiujiang University, Jiujiang, 332005, China
Jingjuan Guo & Zhiqiang Zhao
School of Computer and Information Engineering, Henan University, Kaifeng, 475004, China
Caihong Yuan
International Joint Research Center For Data Science and High-Performance Computing, Guizhou University of Finance and Economics, Guiyang, 550025, China
Ping Feng

Corresponding authors

Correspondence to Jingjuan Guo or Tianjiang Wang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Guo, J., Yuan, C., Zhao, Z. et al. Object detector with enriched global context information. Multimed Tools Appl 79, 29551–29571 (2020). https://doi.org/10.1007/s11042-020-09500-6

Received: 01 October 2019
Revised: 10 July 2020
Accepted: 29 July 2020
Published: 11 August 2020
Issue Date: October 2020
DOI: https://doi.org/10.1007/s11042-020-09500-6
