ABSTRACT
Weakly supervised object localization typically uses the last convolutional feature maps of a classification model, together with the weights of its fully connected (FC) layer, to localize objects. However, these high-level feature maps lack edge features. In addition, the FC weights are specific to the classification task, so only the most discriminative regions are discovered. To fuse edge features and adjust the attention distribution across feature-map channels, we propose an efficient method called the Attention-based Dual-Branches Localization (ADBL) network, which adopts a dual-branch structure and an attention mechanism to mine edge features and non-discriminative features, and thereby cover more of the target object. Specifically, the dual-branch structure cascades low-level feature maps to mine the edge regions of the target object. During the inference stage, the attention mechanism assigns appropriate attention to different features to preserve non-discriminative areas. Extensive experiments on the ILSVRC and CUB-200-2011 datasets show that ADBL achieves substantial performance improvements.
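For context, the baseline that the abstract starts from (weighting the last convolutional feature maps by the FC weights of one class) can be sketched roughly as below. This is a minimal NumPy illustration of class activation mapping, not the ADBL network itself. The function names `class_activation_map` and `bounding_box`, the array shapes, and the 0.5 threshold are all assumptions made for the example.

```python
import numpy as np

def class_activation_map(feature_maps, fc_weights, class_idx):
    """Weighted sum of feature maps for one class (hypothetical helper).

    feature_maps: array of shape (C, H, W), the last conv layer's output.
    fc_weights:   array of shape (num_classes, C), the FC layer's weights.
    """
    weights = fc_weights[class_idx]                    # shape (C,)
    cam = np.tensordot(weights, feature_maps, axes=1)  # shape (H, W)
    cam = cam - cam.min()                              # shift so the minimum is 0
    peak = cam.max()
    return cam / peak if peak > 0 else cam             # scale to [0, 1] when possible

def bounding_box(cam, threshold=0.5):
    """Tightest box (x_min, y_min, x_max, y_max) around pixels >= threshold.

    The threshold is an assumed value; returns None if no pixel passes it.
    """
    ys, xs = np.nonzero(cam >= threshold)
    if xs.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# Toy input: two channels on a 4x4 grid, each responding to a different region.
features = np.zeros((2, 4, 4))
features[0, 1:3, 1:3] = 1.0   # channel 0 fires on a central 2x2 block
features[1, 0, 3] = 1.0       # channel 1 fires on the top-right pixel
weights = np.array([[1.0, 0.0],   # class 0 relies only on channel 0
                    [0.0, 1.0]])  # class 1 relies only on channel 1

box_class0 = bounding_box(class_activation_map(features, weights, 0))
box_class1 = bounding_box(class_activation_map(features, weights, 1))
```

Because the map is driven entirely by classification weights, a class that depends on one small region (class 1 in the toy input) yields a box around that region only, which is the limitation the abstract describes.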
Index Terms
- Attention-based Dual-Branches Localization Network for Weakly Supervised Object Localization