Weakly-supervised object localization with gradient-pyramid feature

Mao, Zhongjie; Zhou, Yipeng; Sun, Jun; Wu, Hao; Pan, Feng; Ahmad, Bilal

doi:10.1007/s10489-022-03686-y

Published: 14 May 2022

Volume 53, pages 2923–2935, (2023)

Applied Intelligence

Zhongjie Mao¹, Yipeng Zhou¹, Jun Sun¹ (ORCID: orcid.org/0000-0002-9824-4294), Hao Wu¹, Feng Pan¹ & Bilal Ahmad¹

Abstract

Object localization is a basic computer vision task and plays an important role in many vision-based applications. Supervised methods learn to localize objects directly from manual location labels, but incomplete or incorrectly assigned labels degrade localization accuracy, and manual labelling can be extremely costly. This paper proposes a weakly-supervised localization method based on a multi-scale gradient-pyramid feature, which uses weighted gradient features from multiple convolutional layers to generate a gradient-pyramid feature for object localization. Pairs of gradients and features from different layers are first extracted to compute the gradient features. Then, when the gradient features are fused through a pyramid model, the larger value is selected as the fusion result instead of concatenating the features. Finally, the multi-scale gradient-pyramid feature is obtained and, together with a region scaling operation, used to localize the object more accurately. The proposed method can be integrated directly into a pre-trained classification model to perform object localization without additional training. Experimental results on the ILSVRC 2016 and CUB-200-2011 datasets show that the proposed method can achieve better object localization performance.
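The abstract gives the pipeline only in outline, so the following is a minimal NumPy sketch of one way it could look, not the authors' implementation. It assumes Grad-CAM-style channel weighting (each channel weighted by its mean gradient), nearest-neighbour upsampling between pyramid levels, and a fixed-threshold bounding box; none of these choices is confirmed by the abstract, and all function names are hypothetical. The region scaling operation is not specified in the abstract and is left out.

```python
import numpy as np

def gradient_feature(features, grads):
    """Assumed Grad-CAM-style map for one layer.

    features, grads: arrays of shape (C, H, W) taken from the same layer.
    Each channel is weighted by its spatially averaged gradient.
    """
    weights = grads.mean(axis=(1, 2))               # (C,)
    cam = np.tensordot(weights, features, axes=1)   # (H, W)
    cam = np.maximum(cam, 0.0)                      # keep positive evidence only
    peak = cam.max()
    return cam / peak if peak > 0 else cam          # scale to [0, 1]

def upsample_nearest(cam, out_hw):
    """Nearest-neighbour resize of a 2-D map to out_hw = (H, W)."""
    h, w = cam.shape
    rows = np.arange(out_hw[0]) * h // out_hw[0]
    cols = np.arange(out_hw[1]) * w // out_hw[1]
    return cam[np.ix_(rows, cols)]

def fuse_pyramid(cams):
    """Fuse maps ordered from coarsest to finest.

    At each level the running map is upsampled and combined with the next
    map by an element-wise maximum, i.e. the larger value is kept rather
    than concatenating the two maps.
    """
    fused = cams[0]
    for cam in cams[1:]:
        fused = np.maximum(upsample_nearest(fused, cam.shape), cam)
    return fused

def bounding_box(cam, threshold=0.5):
    """Tight box (x_min, y_min, x_max, y_max) around cells >= threshold * max."""
    ys, xs = np.nonzero(cam >= threshold * cam.max())
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())
```

In practice the `(features, grads)` pairs would come from forward and backward hooks on a pre-trained classifier, which is consistent with the abstract's claim that no additional training is needed; the box would then be rescaled to the input image size.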

Data availability

The data that support the findings of this study are available at:

1) https://image-net.org/update-mar-11-2021.php.

2) http://www.vision.caltech.edu/datasets/cub_200_2011/.

3) http://host.robots.ox.ac.uk/pascal/VOC/voc2012/.

4) https://github.com/FiresWorker/WSOL.

Acknowledgements

The authors would like to thank the editor and the reviewers for their critical and constructive comments and suggestions. We would also like to thank Dr. Vasile Palade for proofreading the whole manuscript and giving valuable comments. This work was supported in part by the National Natural Science Foundation of China (Project Numbers: 61673194, 61672263, 61672265), and in part by the National First-Class Discipline Program of Light Industry Technology and Engineering (Project Number: LITE2018-25).

Author information

Authors and Affiliations

School of Artificial Intelligence and Computer Science, Jiangnan University, 214122, Wuxi, Jiangsu, China
Zhongjie Mao, Yipeng Zhou, Jun Sun, Hao Wu, Feng Pan & Bilal Ahmad

Corresponding author

Correspondence to Jun Sun.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Mao, Z., Zhou, Y., Sun, J. et al. Weakly-supervised object localization with gradient-pyramid feature. Appl Intell 53, 2923–2935 (2023). https://doi.org/10.1007/s10489-022-03686-y

Accepted: 23 April 2022
Published: 14 May 2022
Issue Date: February 2023
DOI: https://doi.org/10.1007/s10489-022-03686-y
