
Concise feature pyramid region proposal network for multi-scale object detection

The Journal of Supercomputing

Abstract

Object detection is an active research topic in computer vision. Many methods focus on detecting large objects, while the features of small objects are easily weakened or even lost after multiple convolutional layers, so the detection rate for multi-scale objects is unsatisfactory. To address this problem, this paper proposes a concise feature pyramid region proposal network (CFPRPN) that improves small-object detection without missing large objects. We also propose a new method for adjusting object locations during detection, so that detection is balanced across object scales. CFPRPN combines image pyramids and feature pyramids: an image pyramid consists of scaled versions of an image, and a feature pyramid produces feature maps from multiple layers. Both help capture the feature information of small objects in deep convolutional networks. In addition, proposals of overlapping sizes from different layers are used to improve the recall rate for multi-scale objects. Together, these operations help CFPRPN extract better proposals. Experiments show that adding the location fine-tuning step further improves the detection rate for multi-scale objects. Encouragingly, the location refinement method is applicable to most object detection algorithms.
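
As a rough illustration of the image pyramid mentioned in the abstract, the following Python sketch builds scaled versions of a grayscale image by repeated 2x2 average pooling. The halving factor, the pooling method, and the names `downsample_by_two` and `build_image_pyramid` are assumptions made for illustration only; the abstract does not say how CFPRPN scales its inputs.

```python
from typing import List

# A grayscale image stored as rows of pixel intensities (illustrative type).
Image = List[List[float]]


def downsample_by_two(image: Image) -> Image:
    """Halve width and height by averaging each non-overlapping 2x2 block."""
    # Ignore a trailing odd row or column so every block is complete.
    height = len(image) // 2 * 2
    width = len(image[0]) // 2 * 2
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, width, 2)
        ]
        for r in range(0, height, 2)
    ]


def build_image_pyramid(image: Image, levels: int) -> List[Image]:
    """Return [original, 1/2 scale, 1/4 scale, ...] with at most `levels` entries.

    Stops early once the current level is too small to halve again.
    """
    pyramid = [image]
    for _ in range(levels - 1):
        current = pyramid[-1]
        if len(current) < 2 or len(current[0]) < 2:
            break
        pyramid.append(downsample_by_two(current))
    return pyramid
```

For an 8x8 input and `levels=3`, this sketch yields levels of size 8x8, 4x4, and 2x2. A detector using an image pyramid would run on each level, so that an object appears at several scales, one of which may better match the sizes the network handles well.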



Acknowledgements

This work is supported by the Natural Science Foundation of Anhui Province (1708085MF146), the Science and Technology Support Project of Sichuan Province (2016GZ0389), the Project of Innovation Team of the Ministry of Education of China (IRT17R32), and the Fundamental Research Funds for the Central Universities (No. PA2018GDQT0011).

Author information


Corresponding author

Correspondence to Baofu Fang.


About this article


Cite this article

Fang, B., Fang, L. Concise feature pyramid region proposal network for multi-scale object detection. J Supercomput 76, 3327–3337 (2020). https://doi.org/10.1007/s11227-018-2569-1


  • DOI: https://doi.org/10.1007/s11227-018-2569-1
