Abstract
Object detection is an active research topic in computer vision. Many methods focus on detecting large objects, while the features of small objects are easily weakened or even lost after multiple convolutional layers, so the detection rate for multi-scale objects is unsatisfactory. To address this problem, this paper proposes a concise feature pyramid region proposal network (CFPRPN) that improves small-object detection without missing large objects. We also propose a new method for adjusting object locations during detection, which yields balanced detection of multi-scale objects. CFPRPN combines image pyramids and feature pyramids: an image pyramid consists of scaled versions of an image, and the feature pyramid produces feature maps from multiple layers. Both help capture the feature information of small objects in deep convolutional networks. In addition, proposals of overlapping sizes from different layers are used to improve the recall of multi-scale objects. Together, these operations help CFPRPN extract better proposals. Experiments show that adding the location fine-tuning step further improves the detection rate of multi-scale objects. Encouragingly, the location refinement method is applicable to most object detection algorithms.
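As a rough illustration of the image pyramid idea mentioned in the abstract, the following sketch builds scaled versions of a small image. It is not the authors' implementation: the scale factors, the nearest-neighbour resizing, and the function names `resize_nearest` and `image_pyramid` are assumptions made for this example, since the abstract does not specify them.

```python
def resize_nearest(image, scale):
    """Resize a 2-D grid (list of rows) by nearest-neighbour sampling.

    Nearest-neighbour sampling is an assumption for this sketch; a real
    detector would typically use bilinear or similar interpolation.
    """
    src_h, src_w = len(image), len(image[0])
    dst_h = max(1, round(src_h * scale))
    dst_w = max(1, round(src_w * scale))
    return [
        [
            # Map each output pixel back to its source pixel, clamped to bounds.
            image[min(src_h - 1, int(y / scale))][min(src_w - 1, int(x / scale))]
            for x in range(dst_w)
        ]
        for y in range(dst_h)
    ]


def image_pyramid(image, scales=(2.0, 1.0, 0.5)):
    """Return (scale, resized image) pairs, one per pyramid level.

    The default scales are hypothetical, not taken from the paper.
    """
    return [(s, resize_nearest(image, s)) for s in scales]


# A 4x4 toy "image" whose pixel values are 0..15.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
pyramid = image_pyramid(image)
shapes = [(len(level), len(level[0])) for _, level in pyramid]
```

In this sketch the upscaled level gives a small object more pixels before it passes through the network, while the downscaled level keeps large objects within the range of proposal sizes. In a pipeline of this kind, each level would then be fed to the backbone that produces the multi-layer feature maps.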
Acknowledgements
This work is supported by the Natural Science Foundation of Anhui Province (1708085MF146), Science and Technology Support Project of Sichuan Province (2016GZ0389), Project of Innovation Team of Ministry of Education of China (IRT17R32) and the Fundamental Research Funds for the Central Universities (No. PA2018GDQT0011).
About this article
Cite this article
Fang, B., Fang, L. Concise feature pyramid region proposal network for multi-scale object detection. J Supercomput 76, 3327–3337 (2020). https://doi.org/10.1007/s11227-018-2569-1
DOI: https://doi.org/10.1007/s11227-018-2569-1