Densely convolutional and feature fused object detector

Guo, Jingjuan; Yuan, Caihong; Zhao, Zhiqiang; Feng, Ping; Wang, Tianjiang; Duan, Kui

doi:10.1007/s11042-019-08119-6

Published: 03 September 2019

Multimedia Tools and Applications, Volume 78, pages 35559–35584 (2019)

Jingjuan Guo¹,²,
Caihong Yuan¹,³,
Zhiqiang Zhao²,
Ping Feng¹,
Tianjiang Wang¹ &
Kui Duan¹

Abstract

In this paper, we propose a novel deep convolutional network for object detection, the densely convolutional and feature fused object detector (DCFF-Net), a one-stage object detector trained from scratch, similar to DSOD. The base network stacks several densely convolutional blocks to extract strong semantic information, and a feature fusion module enriches the features by fusing feature maps extracted from different convolutional layers. In the fusion module, feature maps from three adjacent scales are concatenated: features extracted by convolution with large kernels, features extracted by down-sampling pooling, and features extracted by up-sampling deconvolution. The fused feature pyramid carries more representative information and performs better when fed to the final multibox detectors. On Pascal VOC 2007/2012 and MS COCO, our network achieves better results than DSOD and than several methods that use pre-trained models. The experimental results show that fusing feature maps from different layers improves detection performance, especially on small and occluded objects.
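The fusion step described in the abstract can be illustrated with a small shape-arithmetic sketch. This is not the authors' implementation: the names `FeatureMap` and `fuse`, the kernel sizes, the strides, the channel counts, and the assignment of each branch to a scale (pooling on the finer map, large-kernel convolution on the current map, deconvolution on the coarser map) are all assumptions made for illustration. It only shows how three adjacent scales can be brought to a common resolution and concatenated along the channel axis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureMap:
    """Shape of a feature map (hypothetical helper, not from the paper)."""
    channels: int
    height: int
    width: int


def conv_out(size, kernel, stride=1, padding=0):
    # Standard output-size formula for convolution and pooling.
    return (size + 2 * padding - kernel) // stride + 1


def deconv_out(size, kernel, stride=1, padding=0):
    # Standard output-size formula for transposed convolution.
    return (size - 1) * stride - 2 * padding + kernel


def downsample_pool(fm, kernel=2, stride=2):
    # Pooling keeps the channel count and halves the resolution.
    return FeatureMap(fm.channels,
                      conv_out(fm.height, kernel, stride),
                      conv_out(fm.width, kernel, stride))


def big_kernel_conv(fm, out_channels, kernel=5):
    # "Same" padding so the resolution is preserved (assumed 5x5 kernel).
    pad = kernel // 2
    return FeatureMap(out_channels,
                      conv_out(fm.height, kernel, 1, pad),
                      conv_out(fm.width, kernel, 1, pad))


def upsample_deconv(fm, out_channels, kernel=2, stride=2):
    # Transposed convolution doubles the resolution.
    return FeatureMap(out_channels,
                      deconv_out(fm.height, kernel, stride),
                      deconv_out(fm.width, kernel, stride))


def fuse(finer, current, coarser, branch_channels=128):
    """Concatenate three adjacent scales at the resolution of `current`."""
    branches = [
        downsample_pool(finer),
        big_kernel_conv(current, branch_channels),
        upsample_deconv(coarser, branch_channels),
    ]
    sizes = {(b.height, b.width) for b in branches}
    if len(sizes) != 1:
        raise ValueError(f"branch resolutions differ: {sorted(sizes)}")
    height, width = sizes.pop()
    # Channel-wise concatenation sums the channel counts.
    return FeatureMap(sum(b.channels for b in branches), height, width)


# Hypothetical input shapes, chosen so that all three branches align.
fused = fuse(FeatureMap(256, 40, 40),
             FeatureMap(512, 20, 20),
             FeatureMap(256, 10, 10))
print(fused)
```

With these assumed shapes, the pooled branch contributes 256 channels and the two convolutional branches 128 each, so the fused map has 512 channels at 20×20. Odd resolutions (for example 19×19 next to 10×10) would need padding or cropping before concatenation; the sketch raises an error in that case rather than guessing.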

References

Bell S, Zitnick CL, Bala K, Girshick R (2016) Inside-outside net: Detecting objects in context with skip pooling and recurrent neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 2874–2883
Chabot F, Chaouch M, Rabarisoa J, Teuliere C, Chateau T (2017) Deep manta: A coarse-to-fine many-task network for joint 2d and 3d vehicle analysis from monocular image. In: IEEE Conference on computer vision and pattern recognition, pp 1827–1836
Chen Y, Li J, Zhou B, Feng J, Yan S (2017) Weaving multi-scale context for single shot detector. arXiv:1712.03149
Dai J, Li Y, He K, Sun J (2016) R-fcn: Object detection via region-based fully convolutional networks. In: Advances in neural information processing systems, pp 379–387
Deng J, Dong W, Socher R, Li L-J, Li K, Li F-F (2009) Imagenet: A large-scale hierarchical image database. In: IEEE Conference on computer vision and pattern recognition (CVPR 2009). IEEE, pp 248–255
Dong S, Gao Z, Pirbhulal S, Bian G-B, Zhang H, Wu W, Li S (2019) Iot-based 3d convolution for video salient object detection. Neural Comput Applic 4:1–12
Everingham M, Van Gool L, Williams CK, Winn J, Zisserman A (2010) The pascal visual object classes (voc) challenge. Int J Comput Vis 88(2):303–338
Fu C-Y, Liu W, Ranga A, Tyagi A, Berg AC (2017) Dssd: Deconvolutional single shot detector. arXiv:1701.06659
Girshick R (2015) Fast r-cnn. In: IEEE International conference on computer vision, pp 1440–1448
Girshick R, Donahue J, Darrell T, Malik J (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. In: IEEE Conference on computer vision and pattern recognition, pp 580–587
Glorot X, Bengio Y (2010) Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of the thirteenth international conference on artificial intelligence and statistics, pp 249–256
He K, Zhang X, Ren S, Sun J (2015) Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans Pattern Anal Mach Intell 37(9):1904–1916
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778
He K, Gkioxari G, Dollár P, Girshick R (2017) Mask r-cnn. In: 2017 IEEE international conference on Computer vision (ICCV). IEEE, pp 2980–2988
Hoiem D, Chodpathumwan Y, Dai Q (2012) Diagnosing error in object detectors. In: European conference on computer vision. Springer, pp 340–353
Huang G, Liu Z, Weinberger KQ, van der Maaten L (2017) Densely connected convolutional networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, vol 1, pp 3
Jia Y, Shelhamer E, Donahue J, Karayev S, Long J, Girshick R, Guadarrama S, Darrell T (2014) Caffe: Convolutional architecture for fast feature embedding. In: Proceedings of the 22nd ACM international conference on Multimedia. ACM, pp 675–678
Kong T, Yao A, Chen Y, Sun F (2016) Hypernet: Towards accurate region proposal generation and joint object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 845–853
Kong T, Sun F, Yao A, Liu H, Lu M, Chen Y (2017) Ron: Reverse connection with objectness prior networks for object detection. In: IEEE Conference on computer vision and pattern recognition, vol 1, pp 2
Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. In: International conference on neural information processing systems, pp 1097–1105
Lawrence Zitnick C, Dollár P (2014) Edge boxes: Locating object proposals from edges. In: European conference on computer vision. Springer, pp 391–405
Le Cun Y (1995) Convolutional networks for images, speech, and time series. In: Handbook of Brain Theory and Neural Networks
Li Z, Zhou F (2017) Fssd: Feature fusion single shot multibox detector. arXiv:1712.00960
Lin T-Y, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Dollár P, Zitnick CL (2014) Microsoft coco: Common objects in context. In: European conference on computer vision. Springer, pp 740–755
Lin T-Y, Dollár P, Girshick R, He K, Hariharan B, Belongie S (2017) Feature pyramid networks for object detection. In: CVPR, vol 1, pp 4
Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu C-Y, Berg AC (2016) Ssd: Single shot multibox detector. In: European conference on computer vision. Springer, pp 21–37
Pirbhulal S, Samuel OW, Wu W, Sangaiah AK, Li G (2019) A joint resource-aware and medical data security framework for wearable healthcare systems. Futur Gener Comput Syst 95:382–391
Redmon J, Farhadi A (2017) Yolo9000: better, faster, stronger. arXiv preprint
Redmon J, Divvala S, Girshick R, Farhadi A (2016) You only look once: Unified, real-time object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 779–788
Ren S, He K, Girshick R, Sun J (2015) Faster r-cnn: towards real-time object detection with region proposal networks. In: International conference on neural information processing systems, pp 91–99
Samuel OW, Asogbon MG, Geng Y, Al-Timemy AH, Pirbhulal S, Ji N, Chen S, Fang P, Li G (2019) Intelligent emg pattern recognition control method for upper-limb multifunctional prostheses: Advances, current challenges, and future prospects. IEEE Access 7:10150–10165
Shelhamer E, Long J, Darrell T (2014) Fully convolutional networks for semantic segmentation. IEEE Trans Pattern Anal Mach Intell 39(4):640–651
Shen Z, Shi H, Feris R, Cao L, Yan S, Liu D, Wang X, Xue X, Huang TS (2017) Learning object detectors from scratch with gated recurrent feature pyramids. arXiv:1712.00886
Shen Z, Liu Z, Li J, Jiang Y-G, Chen Y, Xue X (2017) Dsod: Learning deeply supervised object detectors from scratch. In: The IEEE international conference on computer vision (ICCV), vol 3, pp 7
Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556
Srivastava RK, Greff K, Schmidhuber J (2015) Highway networks. arXiv:1505.00387
Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A et al (2015) Going deeper with convolutions. CVPR
Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z (2016) Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 2818–2826
Szegedy C, Ioffe S, Vanhoucke V, Alemi AA (2017) Inception-v4, inception-resnet and the impact of residual connections on learning. In: AAAI, vol 4, pp 12
Uijlings JRR, Van De Sande KEA, Gevers T, Smeulders AWM (2013) Selective search for object recognition. Int J Comput Vis 104(2):154–171
Xiang W, Zhang D-Q, Athitsos V, Yu H (2017) Context-aware single-shot detector. arXiv:1707.08682
Xie S, Girshick R, Dollár P, Tu Z, He K (2017) Aggregated residual transformations for deep neural networks. In: 2017 IEEE conference on Computer vision and pattern recognition (CVPR). IEEE, pp 5987–5995
Yi S, Wang X, Tang X (2016) Sparsifying neural network connections for face recognition. In: Computer vision and pattern recognition, pp 4856–4864
Zeiler MD, Fergus R (2014) Visualizing and understanding convolutional networks. In: European conference on computer vision. Springer, pp 818–833
Zheng L, Fu C, Zhao Y (2018) Extend the shallow part of single shot multibox detector via convolutional neural network. arXiv:1801.05918
Zhou P, Ni B, Geng C, Hu J, Xu Y (2018) Scale-transferrable object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

Acknowledgment

This work is supported by the Natural Science Foundation of China (Grants 61572214 and U1536203) and by the Independent Innovation Research Fund sponsored by Huazhong University of Science and Technology (Project No. 2016YXMS089).

Author information

Authors and Affiliations

School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China
Jingjuan Guo, Caihong Yuan, Ping Feng, Tianjiang Wang & Kui Duan
School of Information Science and Technology, Jiujiang University, Jiujiang, 332005, China
Jingjuan Guo & Zhiqiang Zhao
School of Computer and Information Engineering, Henan University, Kaifeng, 475004, China
Caihong Yuan

Corresponding authors

Correspondence to Jingjuan Guo or Kui Duan.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Guo, J., Yuan, C., Zhao, Z. et al. Densely convolutional and feature fused object detector. Multimed Tools Appl 78, 35559–35584 (2019). https://doi.org/10.1007/s11042-019-08119-6

Received: 06 December 2018
Revised: 31 July 2019
Accepted: 13 August 2019
Published: 03 September 2019
Issue Date: December 2019
DOI: https://doi.org/10.1007/s11042-019-08119-6
