PA-RetinaNet: Path Augmented RetinaNet for Dense Object Detection

  • Conference paper
  • In: Artificial Neural Networks and Machine Learning – ICANN 2019: Deep Learning (ICANN 2019)

Abstract

Object detection methods fall into two categories: two-stage methods, which achieve higher accuracy at lower speed, and one-stage methods, which trade some accuracy for higher speed. To inherit the advantages of both, this paper proposes a novel dense object detector called Path Augmented RetinaNet (PA-RetinaNet), which achieves better accuracy than two-stage methods while retaining the efficiency of one-stage methods. Specifically, we introduce a bottom-up path augmentation module that enhances the feature extraction hierarchy by shortening the information path between the lower feature layers and the topmost layers. Furthermore, we address the class imbalance problem with a Class-Imbalance loss, in which the loss of each training example is weighted by a function of its predicted probability, so that the trained model focuses on hard examples. To evaluate the effectiveness of PA-RetinaNet, we conducted extensive experiments on the MS COCO dataset. The results show that our method achieves accuracy 4.3% higher than existing two-stage methods while running at a speed comparable to state-of-the-art one-stage methods.
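The Class-Imbalance loss is described here only at the level of the abstract, so the following PyTorch sketch is an illustration rather than the paper's exact formulation: it weights each example's cross-entropy by a function of its predicted probability, in the spirit of the focal loss [9]. The function name and the alpha/gamma parameters are assumptions for illustration.

    import torch
    import torch.nn.functional as F

    def class_imbalance_loss(logits, targets, alpha=0.25, gamma=2.0):
        # Per-element binary cross-entropy, kept unreduced so it can be re-weighted.
        ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
        p = torch.sigmoid(logits)
        # p_t is the probability the model assigns to the ground-truth label.
        p_t = p * targets + (1 - p) * (1 - targets)
        # Down-weight easy examples (p_t close to 1) so hard examples dominate.
        # Normalization (e.g., by the number of positive anchors) is omitted for brevity.
        alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
        return (alpha_t * (1 - p_t) ** gamma * ce).sum()

    # Toy usage: 8 anchors, 3 classes, sparse positives as in dense detection.
    logits = torch.randn(8, 3)
    targets = (torch.rand(8, 3) > 0.9).float()
    loss = class_imbalance_loss(logits, targets)

The bottom-up path augmentation module likewise follows the idea of PANet [27]: starting from the finest pyramid level, localization-rich low-level features are propagated upward through stride-2 convolutions, shortening the path from the lower layers to the topmost ones. The class below is a minimal sketch of that pattern under these assumptions, not the paper's exact architecture.

    import torch
    import torch.nn as nn

    class BottomUpPath(nn.Module):
        """Adds a bottom-up path over FPN outputs ordered fine to coarse."""
        def __init__(self, channels=256, num_levels=5):
            super().__init__()
            self.downsample = nn.ModuleList(
                [nn.Conv2d(channels, channels, 3, stride=2, padding=1)
                 for _ in range(num_levels - 1)])
            self.smooth = nn.ModuleList(
                [nn.Conv2d(channels, channels, 3, padding=1)
                 for _ in range(num_levels - 1)])

        def forward(self, feats):
            outs = [feats[0]]  # N3 = P3: the finest level passes through unchanged
            for i, p in enumerate(feats[1:]):
                # N_{k+1} = conv(downsample(N_k) + P_{k+1})
                outs.append(self.smooth[i](self.downsample[i](outs[-1]) + p))
            return outs

    # Toy usage: five pyramid levels with halving spatial sizes.
    feats = [torch.randn(1, 256, s, s) for s in (64, 32, 16, 8, 4)]
    augmented = BottomUpPath()(feats)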

Supported by the National Key R&D Program of China (2018YFB0203904), the National Natural Science Foundation of China (61602165, 61872137, 61502158), and the Natural Science Foundation of Hunan Province (2018JJ3074, 2017JJ3042).


Notes

  1. Lin et al. [9] found \(\gamma = 2\) to work best through a large number of experiments; the loss function proposed in this paper is therefore mainly compared against the focal loss at \(\gamma = 2\).
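     For reference, the focal loss of [9] scales the cross-entropy of each example by a modulating factor of its predicted probability \(p_t\) for the ground-truth class:

     \[ \mathrm{FL}(p_t) = -\alpha_t\,(1 - p_t)^{\gamma}\,\log(p_t). \]

     With \(\gamma = 2\), a well-classified example with \(p_t = 0.9\) contributes only \((1 - 0.9)^2 = 1\%\) of the loss it would incur under plain cross-entropy (\(\gamma = 0\)).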

References

  1. Ren, S., He, K., Girshick, R., et al.: Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 39(6), 1137–1149 (2017). https://doi.org/10.1109/TPAMI.2016.2577031

  2. Dai, J., Li, Y., He, K., et al.: R-FCN: object detection via region-based fully convolutional networks (2016)

  3. Lin, T., Dollár, P., Girshick, R., et al.: Feature pyramid networks for object detection. In: CVPR (2017). https://doi.org/10.1109/CVPR.2017.106

  4. Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48


  5. Redmon, J., Divvala, S., Girshick, R., et al.: You only look once: unified, real-time object detection. In: CVPR (2016). https://doi.org/10.1109/CVPR.2016.91

  6. Redmon, J., Farhadi, A.: YOLO9000: better, faster, stronger. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6517–6525 (2017)


  7. Fu, C.Y., Liu, W., Ranga, A., et al.: DSSD: deconvolutional single shot detector (2017)

  8. Liu, W., et al.: SSD: single shot multibox detector. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 21–37. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46448-0_2


  9. Lin, T.Y., Goyal, P., Girshick, R., et al.: Focal loss for dense object detection. IEEE Trans. Pattern Anal. Mach. Intell. (2018). https://doi.org/10.1109/TPAMI.2018.2858826

  10. Uijlings, J.R.R., van de Sande, K.E.A.: Selective search for object recognition. Int. J. Comput. Vis. 104(2), 154–171 (2013). https://doi.org/10.1007/s11263-013-0620-5


  11. Shrivastava, A., Gupta, A., Girshick, R.: Training region-based object detectors with online hard example mining. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 761–769 (2016). https://doi.org/10.1109/CVPR.2016.89

  12. Zhang, S., Zhu, X., Lei, Z., et al.: S\(^3\)FD: single shot scale-invariant face detector (2017). https://doi.org/10.1109/ICCV.2017.30

  13. Kong, T., Sun, F., Yao, A., et al.: RON: reverse connection with objectness prior networks for object detection (2017). https://doi.org/10.1109/CVPR.2017.557

  14. He, K., Zhang, X., Ren, S., et al.: Deep residual learning for image recognition. In: CVPR (2016). https://doi.org/10.1109/CVPR.2016.90

  15. Viola, P., Jones, M.: Rapid object detection using a boosted cascade of simple features. In: CVPR (2001). https://doi.org/10.1109/CVPR.2001.990517

  16. Felzenszwalb, P.F., Girshick, R., McAllester, D., Ramanan, D.: Object detection with discriminatively trained part-based models. IEEE Trans. Pattern Anal. Mach. Intell. 32(9), 1627–1645 (2010)

  17. Girshick, R.: Fast R-CNN. In: ICCV (2015). https://doi.org/10.1109/ICCV.2015.169

  18. Zitnick, C.L., Dollár, P.: Edge boxes: locating object proposals from edges. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 391–405. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_26


  19. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: CVPR, pp. 580–587 (2014)


  20. He, K., Zhang, X., Ren, S., et al.: Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 37(9), 1904–1916 (2015). https://doi.org/10.1007/978-3-319-10578-9_23

  21. Wang, X., Shrivastava, A., Gupta, A.: A-Fast-RCNN: hard positive generation via adversary for object detection. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3039–3048 (2017). https://doi.org/10.1109/cvpr.2017.324

  22. Bell, S., Zitnick, C.L., Bala, K., et al.: Inside-outside net: detecting objects in context with skip pooling and recurrent neural networks. In: CVPR (2016). https://doi.org/10.1109/CVPR.2016.314

  23. Cai, Z., Fan, Q., Feris, R.S., Vasconcelos, N.: A unified multi-scale deep convolutional neural network for fast object detection. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9908, pp. 354–370. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46493-0_22


  24. Kong, T., Yao, A., Chen, Y., et al.: HyperNet: towards accurate region proposal generation and joint object detection (2016). https://doi.org/10.1109/CVPR.2016.98

  25. Shrivastava, A., Sukthankar, R., Malik, J., et al.: Beyond skip connections: top-down modulation for object detection (2016)

  26. Sermanet, P., Eigen, D., Zhang, X., et al.: OverFeat: integrated recognition, localization and detection using convolutional networks. arXiv preprint (2013)

  27. Liu, S., Qi, L., Qin, H., et al.: Path aggregation network for instance segmentation (2018). https://doi.org/10.1109/CVPR.2018.00913

  28. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(4), 640–651 (2017). https://doi.org/10.1109/TPAMI.2016.2572683

  29. He, K., Gkioxari, G., Dollár, P., et al.: Mask R-CNN. IEEE Trans. Pattern Anal. Mach. Intell. (2018). https://doi.org/10.1109/TPAMI.2018.2844175

  30. PyTorch homepage. https://pytorch.org


Acknowledgments

This project builds on yhenon's work; we thank yhenon for providing the code (https://github.com/yhenon/pytorch-retinanet).

Author information


Corresponding author

Correspondence to Guanghua Tan.



Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Tan, G., Guo, Z., Xiao, Y. (2019). PA-RetinaNet: Path Augmented RetinaNet for Dense Object Detection. In: Tetko, I., Kůrková, V., Karpov, P., Theis, F. (eds) Artificial Neural Networks and Machine Learning – ICANN 2019: Deep Learning. ICANN 2019. Lecture Notes in Computer Science, vol 11728. Springer, Cham. https://doi.org/10.1007/978-3-030-30484-3_12

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-30484-3_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-30483-6

  • Online ISBN: 978-3-030-30484-3

  • eBook Packages: Computer Science, Computer Science (R0)
