Abstract
Current object detection frameworks mainly rely on bounding box regression to localize objects. Despite remarkable progress in recent years, the precision of bounding box regression remains unsatisfactory, which limits performance in object detection. We observe that precise localization requires careful placement of each side of the bounding box. However, the mainstream approach, which focuses on predicting centers and sizes, is not the most effective way to accomplish this, especially when there exist displacements with large variance between the anchors and the targets. In this paper, we propose an alternative approach, named Side-Aware Boundary Localization (SABL), where each side of the bounding box is localized by a dedicated network branch. To tackle the difficulty of precise localization in the presence of displacements with large variance, we further propose a two-step localization scheme, which first predicts a range of movement through bucket prediction and then pinpoints the precise position within the predicted bucket. We test the proposed method on both two-stage and single-stage detection frameworks. Replacing the standard bounding box regression branch with the proposed design leads to significant improvements on Faster R-CNN, RetinaNet, and Cascade R-CNN, by 3.0%, 1.7%, and 0.9%, respectively. Code is available at https://github.com/open-mmlab/mmdetection.
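To make the two-step scheme concrete, the sketch below illustrates, under assumed conventions, how the coordinate of a single box side could be decoded by first selecting a bucket within an enlarged search range and then refining it with a per-bucket offset. This is not the authors' implementation (which is released in MMDetection); the function name `decode_side`, the bucket layout, and the `scale` parameter are illustrative assumptions.

```python
# Minimal sketch of bucket-then-refine decoding for one box side.
# Assumptions: the search range is the proposal extent along this axis,
# enlarged by `scale` and split into K equal buckets.
import torch


def decode_side(anchor_lo, anchor_hi, bucket_logits, bucket_offsets, scale=3.0):
    """Decode the coordinate of a single side for N proposals.

    anchor_lo, anchor_hi: (N,) proposal extent along this axis.
    bucket_logits:        (N, K) classification scores, one per bucket.
    bucket_offsets:       (N, K) fine offsets, in units of the bucket width.
    """
    num_buckets = bucket_logits.size(1)
    side_len = anchor_hi - anchor_lo
    range_len = side_len * scale                      # enlarged search range
    bucket_w = range_len / num_buckets
    left = anchor_lo - (range_len - side_len) / 2     # range centred on proposal

    # Step 1 (coarse): pick the highest-scoring bucket and take its centre.
    best = bucket_logits.argmax(dim=1)                # (N,)
    centres = left + (best.float() + 0.5) * bucket_w

    # Step 2 (fine): shift the chosen bucket centre by the regressed offset.
    fine = bucket_offsets.gather(1, best.unsqueeze(1)).squeeze(1)
    return centres + fine * bucket_w


if __name__ == "__main__":
    # Toy example: decode one side for two proposals with K = 7 buckets.
    x1 = torch.tensor([10.0, 40.0])
    x2 = torch.tensor([50.0, 120.0])
    logits = torch.randn(2, 7)
    offsets = 0.1 * torch.randn(2, 7)
    print(decode_side(x1, x2, logits, offsets))
```

The design point the sketch is meant to convey is that the coarse bucket classification absorbs the large-variance part of the displacement, so the fine regression only has to predict a small offset within one bucket.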
Acknowledgement
This work is partially supported by the SenseTime Collaborative Grant on Large-scale Multi-modality Analysis (CUHK Agreement No. TS1610626 & No. TS1712093), the General Research Fund (GRF) of Hong Kong (No. 14203518 & No. 14205719), SenseTime-NTU Collaboration Project and NTU NAP.
About this paper
Cite this paper
Wang, J., et al. (2020). Side-Aware Boundary Localization for More Precise Object Detection. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) Computer Vision – ECCV 2020. Lecture Notes in Computer Science, vol. 12349. Springer, Cham. https://doi.org/10.1007/978-3-030-58548-8_24