R-PFN: Towards Precise Object Detection by Recurrent Pyramidal Feature Fusion

Jia, Qifei; Wei, Shikui; Zhao, Yufeng

doi:10.1007/978-3-030-60633-6_47

Conference paper
First Online: 11 October 2020

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12305)

Abstract

Object detection has been widely studied over the last few decades, yet handling objects at different scales remains a challenge. To address this problem, we explore how to better utilize the multi-scale features generated by a convolutional neural network. Specifically, we fuse a set of pyramidal features in a circular manner and propose a cascaded module designed to enhance a feature at one scale with information from a feature at a different scale. We then make the module recurrent to further facilitate information fusion among the multi-scale features. The proposed module can be integrated into any pyramid architecture. In this paper, we combine it with FPN-based Faster R-CNN, resulting in a framework named Recurrent Pyramidal Fusion Network (R-PFN). Experiments demonstrate the effectiveness of R-PFN: we achieve new state-of-the-art performance of 82.0% and 43.3% mean AP on the PASCAL VOC 2007 and MS COCO benchmarks, respectively.
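
The abstract does not specify the fusion operator, so the following is only a minimal sketch of the general idea (circular fusion across pyramid levels, repeated recurrently), not the authors' module. It assumes single-channel feature maps stored as nested lists, nearest-neighbour resizing, and plain averaging in place of any learned fusion; the names `resize_nearest`, `fuse_once`, and `recurrent_fusion` are hypothetical.

```python
from typing import List

# Hypothetical simplification: one single-channel feature map, H x W.
Feature = List[List[float]]


def resize_nearest(src: Feature, out_h: int, out_w: int) -> Feature:
    """Nearest-neighbour resize so two pyramid levels can be combined."""
    in_h, in_w = len(src), len(src[0])
    return [
        [src[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def fuse_once(pyramid: List[Feature]) -> List[Feature]:
    """One circular pass: level i is enhanced by level (i + 1) mod n."""
    n = len(pyramid)
    fused = []
    for i, feat in enumerate(pyramid):
        h, w = len(feat), len(feat[0])
        other = resize_nearest(pyramid[(i + 1) % n], h, w)
        # Assumption: averaging stands in for a learned fusion operator.
        fused.append(
            [[0.5 * (feat[r][c] + other[r][c]) for c in range(w)] for r in range(h)]
        )
    return fused


def recurrent_fusion(pyramid: List[Feature], steps: int) -> List[Feature]:
    """Apply the circular pass `steps` times; shapes of all levels are kept."""
    for _ in range(steps):
        pyramid = fuse_once(pyramid)
    return pyramid


# Toy pyramid: 4x4, 2x2 and 1x1 maps filled with constant values.
toy_pyramid = [
    [[1.0] * 4 for _ in range(4)],
    [[2.0] * 2 for _ in range(2)],
    [[4.0]],
]
fused_pyramid = recurrent_fusion(toy_pyramid, steps=2)
```

In this toy setting, each extra step lets information from more distant levels reach a given level, which is the intuition behind making the fusion recurrent; a real detector would operate on multi-channel tensors with learned layers.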

Student Paper.

Acknowledgements

This work is supported in part by the National Key Research and Development of China (2017YFC1703503) and the National Natural Science Foundation of China (61972022, 61532005).

Author information

Authors and Affiliations

Beijing Jiaotong University, Beijing, China
Qifei Jia & Shikui Wei
China Academy of Chinese Medical Sciences, Beijing, China
Yufeng Zhao

Corresponding author

Correspondence to Shikui Wei.

Editor information

Editors and Affiliations

Peking University, Beijing, China
Yuxin Peng
Nanjing University of Information Science and Technology, Nanjing, China
Qingshan Liu
Dalian University of Technology, Dalian, China
Huchuan Lu
Chinese Academy of Sciences, Beijing, China
Zhenan Sun
Chinese Academy of Sciences, Beijing, China
Chenglin Liu
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Xilin Chen
Peking University, Beijing, China
Hongbin Zha
Nanjing University of Science and Technology, Nanjing, China
Jian Yang

About this paper

Cite this paper

Jia, Q., Wei, S., Zhao, Y. (2020). R-PFN: Towards Precise Object Detection by Recurrent Pyramidal Feature Fusion. In: Peng, Y., et al. Pattern Recognition and Computer Vision. PRCV 2020. Lecture Notes in Computer Science, vol 12305. Springer, Cham. https://doi.org/10.1007/978-3-030-60633-6_47

DOI: https://doi.org/10.1007/978-3-030-60633-6_47
Published: 11 October 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60632-9
Online ISBN: 978-3-030-60633-6
eBook Packages: Computer Science, Computer Science (R0)
