Multi-branch Bounding Box Regression for Object Detection

Published in: Cognitive Computation

Abstract

Localization and classification are two key components of visual object detection. In recent years, object detectors have increasingly focused on designing various localization branches, and bounding box regression is vital for two-stage detectors. We therefore propose a multi-branch bounding box regression method, called Multi-Branch R-CNN, for robust object localization. Multi-Branch R-CNN consists of a fully connected head and a fully convolutional head. The fully convolutional head exploits spatial semantics and is complementary to the fully connected head, which prefers local features. The features extracted from the two localization branches are fused and then passed to the next stage for classification and regression. The two branches cooperate to predict more precise localization, which significantly improves the performance of the detector. Extensive experiments were conducted on the public PASCAL VOC and MS COCO benchmarks. On the COCO dataset, Multi-Branch R-CNN with a ResNet-101 backbone achieved state-of-the-art single-model results, obtaining an mAP of 43.2. Extensive comparative experiments demonstrate the effectiveness of the proposed method.
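The two-branch head described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the layer sizes, the 3×3 kernel, the global average pooling, and the element-wise-sum fusion are all assumptions made for the sketch; only the overall pattern (a fully connected branch and a fully convolutional branch whose features are fused before the final classification and regression layers) comes from the paper's description.

```python
import numpy as np

rng = np.random.default_rng(0)

def fc_branch(roi_feat, w):
    # Fully connected head: flatten the RoI features and project to a vector.
    return roi_feat.reshape(-1) @ w                      # shape (d,)

def conv_branch(roi_feat, k):
    # Fully convolutional head: a 3x3 convolution (valid padding)
    # followed by global average pooling, keeping spatial semantics.
    c, h, wd = roi_feat.shape
    out = np.zeros((k.shape[0], h - 2, wd - 2))
    for o in range(k.shape[0]):
        for i in range(h - 2):
            for j in range(wd - 2):
                out[o, i, j] = np.sum(roi_feat[:, i:i + 3, j:j + 3] * k[o])
    return out.mean(axis=(1, 2))                         # shape (d,)

# RoI-pooled feature map: 256 channels at 7x7, a typical two-stage setting.
roi = rng.standard_normal((256, 7, 7))
w = rng.standard_normal((256 * 7 * 7, 128)) * 0.01       # FC weights (assumed size)
k = rng.standard_normal((128, 256, 3, 3)) * 0.01         # conv kernels (assumed size)

# Fuse the two localization branches; the fused feature would then feed
# the classification and bounding-box-regression layers of the next stage.
fused = fc_branch(roi, w) + conv_branch(roi, k)
print(fused.shape)  # (128,)
```

In a real detector both branches would be multi-layer trained networks; the point of the sketch is only that the two heads consume the same RoI features and contribute jointly to one fused representation.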



Acknowledgements

The authors would like to thank the editor and anonymous reviewers for their valuable comments and suggestions, which were very helpful in improving this paper.

Funding

This work was supported in part by the NSFC Key Project of International (Regional) Cooperation and Exchanges (No. 61860206004), the National Natural Science Foundation of China (No. 61976004), and the Collegiate Natural Science Fund of Anhui Province (No. KJ2017A014).

Author information

Corresponding author

Correspondence to Si-Bao Chen.

Ethics declarations

Conflict of Interest

The authors declare no competing interests.

Ethics Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article


Cite this article

Yuan, HS., Chen, SB., Luo, B. et al. Multi-branch Bounding Box Regression for Object Detection. Cogn Comput 15, 1300–1307 (2023). https://doi.org/10.1007/s12559-021-09983-x

