DOI: 10.1145/3469877.3490568
short-paper

Attention-based Dual-Branches Localization Network for Weakly Supervised Object Localization

Published: 10 January 2022

ABSTRACT

Weakly supervised object localization exploits the last convolutional feature maps of a classification model and the weights of its Fully-Connected (FC) layer to achieve localization. However, the high-level feature maps used for localization lack edge features. Additionally, the weights are specific to the classification task, so only the discriminative regions are discovered. To fuse edge features and adjust the attention distribution over feature-map channels, we propose an efficient method called the Attention-based Dual-Branches Localization (ADBL) Network, in which a dual-branches structure and an attention mechanism are adopted to mine edge features and non-discriminative features and thereby locate more of the target area. Specifically, the dual-branches structure cascades low-level feature maps to mine the edge regions of the target object. Additionally, during the inference stage, the attention mechanism assigns appropriate attention to different features to preserve non-discriminative areas. Extensive experiments on both the ILSVRC and CUB-200-2011 datasets show that the ADBL method achieves substantial performance improvements.
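The baseline that the abstract starts from, weighting the last convolutional feature maps by the FC-layer weights of one class, can be sketched as follows. This is a minimal illustration of the standard class activation map idea, not the ADBL network itself. The function names, the array shapes, the 0.5 threshold, and the simplification of boxing all thresholded cells (rather than the largest connected region) are assumptions made for the example.

```python
import numpy as np


def class_activation_map(feature_maps, fc_weights, class_idx):
    """Weight each feature-map channel by the FC weight of one class.

    feature_maps: array of shape (C, H, W), last convolutional output.
    fc_weights:   array of shape (num_classes, C), FC-layer weights.
    Returns an (H, W) map rescaled to the range [0, 1].
    """
    weights = fc_weights[class_idx]  # shape (C,)
    # Weighted sum over the channel axis gives one spatial map.
    cam = np.tensordot(weights, feature_maps, axes=([0], [0]))
    cam = cam - cam.min()
    peak = cam.max()
    if peak > 0:
        cam = cam / peak
    return cam


def bounding_box(cam, threshold=0.5):
    """Tight box around all cells at or above the threshold.

    Returns (row_min, col_min, row_max, col_max), or None if no cell
    reaches the threshold. Assumption: a single box around every
    thresholded cell, with no connected-component step.
    """
    rows, cols = np.where(cam >= threshold)
    if rows.size == 0:
        return None
    return int(rows.min()), int(cols.min()), int(rows.max()), int(cols.max())
```

Because the weights come from a classifier, channels that do not help separate classes receive small weights, which is one way to see why such a map tends to cover only the most discriminative part of the object.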


          Published in

          MMAsia '21: Proceedings of the 3rd ACM International Conference on Multimedia in Asia
          December 2021
          508 pages
          ISBN: 9781450386074
          DOI: 10.1145/3469877

          Copyright © 2021 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Qualifiers

          • short-paper
          • Research
          • Refereed limited

          Acceptance Rates

          Overall Acceptance Rate: 59 of 204 submissions, 29%
