Hierarchical saliency mapping for weakly supervised object localization based on class activation mapping

Cheng, Zhuo; Li, Hongjian; Zeng, Xiangyan; Wang, Meiqi; Duan, Xiaolin

doi:10.1007/s11042-020-09556-4

Published: 20 August 2020

Volume 79, pages 31283–31298, (2020)

Multimedia Tools and Applications

Abstract

Weakly supervised object localization is a fundamental research problem in computer vision. In this paper, we propose a hierarchical saliency mapping network for object localization, designed to avoid missing detailed information about the potential object. Starting from a classical convolutional network, we remove the fully connected layers and add multiple information extraction branches. The network extracts information from convolutional layers at different scales to generate a Hierarchical Saliency Map. Hierarchical Saliency Maps, which include the Hierarchical-Class Activation Map and the Hierarchical-Spatial Pyramid Saliency Map, fuse deep-level and low-level features to locate the object. The method is tested on the Caltech-UCSD Birds 200, Caltech101 and ImageNet datasets. Compared with the Class Activation Map and the Spatial Pyramid Saliency Map, localization accuracy is improved. The method can be applied to fine-grained classification, object tracking and other fields.
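
As a rough illustration of the ideas named in the abstract, the sketch below computes a standard class activation map (a class-weighted sum of the final convolutional feature maps) and then fuses maps taken from layers of different resolution. The fusion rule used here (normalize each map, upsample by nearest neighbour, average) and all function names are assumptions made for illustration; the abstract does not specify how the paper's branches combine their outputs, so this should not be read as the authors' implementation.

```python
import numpy as np

def class_activation_map(features, class_weights):
    """Standard CAM: weighted sum over channels.
    features: (K, H, W) conv feature maps; class_weights: (K,) weights of one class."""
    return np.tensordot(class_weights, features, axes=([0], [0]))  # -> (H, W)

def normalize(m):
    """Shift and scale a map into [0, 1]; a constant map becomes all zeros."""
    m = m - m.min()
    peak = m.max()
    return m / peak if peak > 0 else m

def upsample_nearest(m, size):
    """Nearest-neighbour resize of a 2-D map to size = (H, W)."""
    h, w = size
    rows = (np.arange(h) * m.shape[0]) // h
    cols = (np.arange(w) * m.shape[1]) // w
    return m[np.ix_(rows, cols)]

def fuse_maps(maps, size):
    """Hypothetical fusion rule (an assumption, not taken from the paper):
    normalize every map, bring it to a common size, then average."""
    resized = [upsample_nearest(normalize(m), size) for m in maps]
    return np.mean(resized, axis=0)

def bounding_box(saliency, threshold=0.5):
    """Tightest box (x_min, y_min, x_max, y_max) around pixels whose value
    is at least threshold * max(saliency)."""
    ys, xs = np.nonzero(saliency >= threshold * saliency.max())
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# Toy data: a deep, coarse layer (4x4) and a shallower, finer layer (8x8).
deep = np.zeros((2, 4, 4))
deep[0, 1:3, 1:3] = 1.0
deep_cam = class_activation_map(deep, np.array([2.0, 0.0]))

shallow_map = np.zeros((8, 8))
shallow_map[2:6, 2:6] = 1.0

fused = fuse_maps([deep_cam, shallow_map], size=(8, 8))
box = bounding_box(fused, threshold=0.5)
```

Averaging is only one possible choice; a weighted sum or an element-wise maximum would fit the same interface, and the threshold of 0.5 is likewise an arbitrary illustrative value.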

Acknowledgments

This work was supported by the Chongqing Science and Technology Commission Project (Grant Nos. cstc2017jcyj-AX0142 and cstc2018jcyjAX0525) and the Key Research and Development Projects of Sichuan Science and Technology Department (Grant No. 2019YFG0107).

Author information

Authors and Affiliations

Department of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, People’s Republic of China
Zhuo Cheng, Hongjian Li, Xiangyan Zeng, Meiqi Wang & Xiaolin Duan

Corresponding author

Correspondence to Hongjian Li.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Cheng, Z., Li, H., Zeng, X. et al. Hierarchical saliency mapping for weakly supervised object localization based on class activation mapping. Multimed Tools Appl 79, 31283–31298 (2020). https://doi.org/10.1007/s11042-020-09556-4

Received: 18 September 2019
Revised: 08 July 2020
Accepted: 06 August 2020
Published: 20 August 2020
Issue Date: November 2020
DOI: https://doi.org/10.1007/s11042-020-09556-4
