Densely convolutional and feature fused object detector

Multimedia Tools and Applications

Abstract

In this paper, we propose a novel deep convolutional network for object detection, the densely convolutional and feature fused object detector (DCFF-Net), a one-stage object detector trained from scratch, similar to DSOD. The base network is built by stacking several densely convolutional blocks to extract strong semantic information, and a feature fusion module enriches the features by fusing feature maps extracted from different convolutional layers. In the fusion module, feature maps from three adjacent scales are concatenated: features extracted by convolution with large kernels, features extracted by down-sampling pooling, and features extracted by up-sampling deconvolution. The fused feature pyramid carries more representative information and yields better performance when fed to the final multibox detectors. On Pascal VOC 2007/2012 and MS COCO, our network achieves better results than DSOD and several methods that rely on pre-trained models. The experimental results show that fusing feature maps from different layers improves detection performance, especially on small and occluded objects.
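The three-branch fusion described above only works if all branches arrive at the same spatial size before concatenation. The following is a minimal shape-arithmetic sketch of that constraint, not the authors' implementation. It assumes that the finer map is pooled down, the current-scale map passes through a large-kernel convolution, and the coarser map is deconvolved up. The kernel sizes, strides, padding, channel counts and the helper names `conv_out`, `deconv_out` and `fused_shape` are all hypothetical and chosen only for illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1


def deconv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a transposed convolution (no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel


def fused_shape(finer, current, coarser):
    """Return (channels, size) of the concatenated map at the current scale.

    Each argument is a (channels, size) pair for a square feature map.
    Layer hyperparameters below are illustrative assumptions, not values
    taken from the paper.
    """
    c_fine, s_fine = finer
    c_cur, s_cur = current
    c_coarse, s_coarse = coarser

    # Finer scale: 2x2 pooling with stride 2 halves the resolution.
    down = conv_out(s_fine, kernel=2, stride=2)
    # Current scale: 5x5 convolution with padding 2 keeps the resolution.
    same = conv_out(s_cur, kernel=5, stride=1, padding=2)
    # Coarser scale: 2x2 transposed convolution with stride 2 doubles it.
    up = deconv_out(s_coarse, kernel=2, stride=2)

    if not (down == same == up):
        raise ValueError(
            f"branch sizes differ: pooled={down}, conv={same}, deconv={up}"
        )
    # Concatenation stacks the branches along the channel axis.
    return (c_fine + c_cur + c_coarse, same)


# Hypothetical pyramid levels of size 40, 20 and 10.
shape = fused_shape((128, 40), (256, 20), (256, 10))  # (640, 20)
```

In this sketch each branch keeps its input channel count, so the fused map has the sum of the three. A real detector would typically also choose output channels per branch, and odd map sizes (for example 19 next to 10) need padding or output padding on the deconvolution to line up.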


Acknowledgment

This work is supported by the Natural Science Foundation of China (Grants 61572214 and U1536203) and the Independent Innovation Research Fund sponsored by Huazhong University of Science and Technology (Project No. 2016YXMS089).

Author information

Corresponding authors

Correspondence to Jingjuan Guo or Kui Duan.

About this article

Cite this article

Guo, J., Yuan, C., Zhao, Z. et al. Densely convolutional and feature fused object detector. Multimed Tools Appl 78, 35559–35584 (2019). https://doi.org/10.1007/s11042-019-08119-6

  • DOI: https://doi.org/10.1007/s11042-019-08119-6
