research-article

Cross-Pixel Dependency with Boundary-Feature Transformation for Weakly Supervised Semantic Segmentation

Authors:

Xiangping Zheng

ICMR '22: Proceedings of the 2022 International Conference on Multimedia Retrieval

Pages 554 - 561

https://doi.org/10.1145/3512527.3531360

Published: 27 June 2022

Abstract

Weakly supervised semantic segmentation with image-level labels is a challenging problem that typically relies on the initial responses generated by a classification network to locate object regions. However, such initial responses cover only the most discriminative parts of the object and may incorrectly activate in background regions. To address this problem, we propose a Cross-pixel Dependency with Boundary-feature Transformation (CDBT) method for weakly supervised semantic segmentation. Specifically, we develop a boundary-feature transformation mechanism that builds strong connections among pixels belonging to the same object and weak connections among pixels of different objects. Moreover, we design a cross-pixel dependency module to enhance the initial responses. It exploits contextual appearance information and refines the prediction at each pixel using its relations to global channel pixels, thus generating higher-quality pseudo labels for training the semantic segmentation network. Extensive experiments on the PASCAL VOC 2012 segmentation benchmark demonstrate that our method outperforms state-of-the-art methods that use image-level labels as weak supervision.
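The abstract gives no implementation details, so the following is only a rough NumPy sketch of the general idea it describes: initial class responses are propagated between pixels with similar features, pixel pairs near a boundary are down-weighted, and the refined responses are thresholded into pseudo labels. It is not the CDBT method itself. The function names `refine_responses` and `pseudo_labels`, the cosine-similarity affinity, and the `temperature` and `bg_threshold` parameters are all assumptions made for illustration, and the sketch uses spatial pixel-to-pixel relations only.

```python
import numpy as np


def refine_responses(cam, feats, boundary, temperature=1.0):
    """Propagate initial class responses between similar pixels (illustrative only).

    cam:      (C, H, W) initial class responses from a classifier
    feats:    (D, H, W) per-pixel feature vectors
    boundary: (H, W) boundary probability in [0, 1]; 1 means "on a boundary"
    """
    C, H, W = cam.shape
    D = feats.shape[0]
    f = feats.reshape(D, H * W)
    f = f / (np.linalg.norm(f, axis=0, keepdims=True) + 1e-8)
    sim = f.T @ f  # (HW, HW) pairwise cosine similarity

    # Assumption: weaken connections that involve boundary pixels.
    keep = 1.0 - boundary.reshape(-1)
    logits = sim / temperature + np.log(np.outer(keep, keep) + 1e-8)

    # Row-wise softmax: each pixel attends to every other pixel.
    logits = logits - logits.max(axis=1, keepdims=True)
    attn = np.exp(logits)
    attn = attn / attn.sum(axis=1, keepdims=True)

    # refined[c, i] = sum_j attn[i, j] * cam[c, j]
    refined = cam.reshape(C, H * W) @ attn.T
    return refined.reshape(C, H, W)


def pseudo_labels(refined, image_labels, bg_threshold=0.3):
    """Turn refined responses into a label map (0 = background, c + 1 = class c)."""
    C = refined.shape[0]
    scores = np.maximum(refined, 0.0)
    scores = scores / (scores.max(axis=(1, 2), keepdims=True) + 1e-8)
    # Suppress classes that the image-level labels say are absent.
    present = np.asarray(image_labels, dtype=scores.dtype).reshape(C, 1, 1)
    scores = scores * present
    labels = scores.argmax(axis=0) + 1
    labels[scores.max(axis=0) < bg_threshold] = 0
    return labels


if __name__ == "__main__":
    H = W = 4
    feats = np.zeros((2, H, W))
    feats[0, :, :2] = 1.0  # left half: one appearance
    feats[1, :, 2:] = 1.0  # right half: another appearance
    cam = np.zeros((2, H, W))
    cam[0, 0, 0] = 1.0     # only one "discriminative" pixel fires
    refined = refine_responses(cam, feats, np.zeros((H, W)), temperature=0.1)
    print(pseudo_labels(refined, image_labels=[1, 0]))
```

In this toy setup a single activated pixel spreads its response to the other pixels that share its features, which mirrors the motivation stated in the abstract: initial responses cover only the most discriminative part of an object and need to be expanded before they can serve as pseudo labels.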

Supplementary Material

MP4 File (ICMR22-fp048.mp4)

The topic of my paper is "Cross-Pixel Dependency with Boundary-Feature Transformation for Weakly Supervised Semantic Segmentation". The outline of my talk is as follows. First, I introduce the background of this research. Second, I present the framework of our method. Then I present the experimental results. Finally, I give a brief conclusion.

Index Terms

Cross-Pixel Dependency with Boundary-Feature Transformation for Weakly Supervised Semantic Segmentation
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Image segmentation
      2. Computer vision tasks
        Scene understanding

Published In

ICMR '22: Proceedings of the 2022 International Conference on Multimedia Retrieval

June 2022

714 pages

ISBN:9781450392389

DOI:10.1145/3512527

General Chairs:
Vincent Oria, New Jersey Institute of Technology, USA
Maria Luisa Sapino, Università degli Studi di Torino, Italy
Shin'ichi Satoh, National Institute of Informatics, Japan
Brigitte Kerhervé, Université du Québec à Montréal, Canada

Program Chairs:
Wen-Huang Cheng, National Yang Ming Chiao Tung University, Taiwan
Ichiro Ide, Nagoya University, Japan
Vivek Singh, Rutgers University, USA

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

National Natural Science Foundation of China

Conference

ICMR '22

Sponsor:

SIGMM

ICMR '22: International Conference on Multimedia Retrieval

June 27 - 30, 2022

Newark, NJ, USA

Acceptance Rates

Overall Acceptance Rate 254 of 830 submissions, 31%
