research-article

Dual-aware Domain Mining and Cross-aware Supervision for Weakly-supervised Semantic Segmentation

Published: 04 May 2023

Abstract

Weakly supervised semantic segmentation with image-level annotation uses localization maps from a classifier to generate pseudo labels. However, such localization maps focus only on sparse salient object regions, which makes it difficult to generate high-quality segmentation labels and deviates from the requirement of semantic segmentation. To address this issue, we propose a dual-aware domain mining and cross-aware supervision (DDMCAS) method for weakly-supervised semantic segmentation. Specifically, we propose a dual-aware domain mining (DDM) module consisting of a graph-based global reasoning unit and a salient-region extension controller, which produces dense localization maps by exploring object features in salient regions and adjacent non-salient regions simultaneously. To further bridge the gap between salient regions and adjacent non-salient regions and generate more refined localization maps, we propose a cross-aware supervision (CAS) strategy that recovers missing parts of the target objects and enhances weak attention in adjacent non-salient regions, leading to higher-quality pseudo labels for training the segmentation network. Extensive experiments on the PASCAL VOC 2012 dataset, using the generated pseudo labels, demonstrate that our method outperforms state-of-the-art methods that use image-level labels for weakly supervised semantic segmentation.
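To make the starting point of the abstract concrete, the following is a minimal sketch of the generic baseline it refers to: a classifier's localization (class activation) maps are thresholded into per-pixel pseudo labels. It is not the DDMCAS method. The function names, the 0.3 threshold, the background index 0, and the toy feature values are all assumptions made for illustration.

```python
from typing import Dict, List

Grid = List[List[float]]


def class_activation_map(features: List[Grid], weights: List[float]) -> Grid:
    """Weighted sum of feature channels, clipped at zero and scaled to [0, 1]."""
    h, w = len(features[0]), len(features[0][0])
    cam = [
        [max(0.0, sum(wk * f[y][x] for wk, f in zip(weights, features)))
         for x in range(w)]
        for y in range(h)
    ]
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam


def pseudo_label(cams: Dict[int, Grid], threshold: float = 0.3,
                 background: int = 0) -> List[List[int]]:
    """Give each pixel the strongest present class, or background if too weak."""
    first = next(iter(cams.values()))
    h, w = len(first), len(first[0])
    labels = [[background] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            cls, score = max(((c, m[y][x]) for c, m in cams.items()),
                             key=lambda pair: pair[1])
            if score >= threshold:
                labels[y][x] = cls
    return labels


# Toy 2x3 feature maps (hypothetical values): channel 0 fires on the left,
# channel 1 on the right, and both are weak in the middle column.
features = [
    [[1.0, 0.2, 0.0], [0.5, 0.1, 0.0]],
    [[0.0, 0.1, 1.0], [0.0, 0.2, 0.8]],
]
# Hypothetical classifier weights for two image-level classes (1 and 2).
cams = {
    1: class_activation_map(features, [1.0, 0.0]),
    2: class_activation_map(features, [0.0, 1.0]),
}
labels = pseudo_label(cams)
```

In this toy case the middle column falls to background because neither map responds strongly enough there, which illustrates the sparse coverage of salient regions that the abstract identifies as the limitation of classifier-derived localization maps.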

Cited By

  • (2025) Urban scene representation and summarization using knowledge graph. Expert Systems with Applications 275 (126956). DOI: 10.1016/j.eswa.2025.126956. Online publication date: May-2025
  • (2024) Weakly-supervised temporal action localization using multi-branch attention weighting. Multimedia Systems 30:5. DOI: 10.1007/s00530-024-01445-2. Online publication date: 30-Aug-2024

      Published In

      ACM Transactions on Knowledge Discovery from Data, Volume 17, Issue 7
      August 2023
      319 pages
      ISSN: 1556-4681
      EISSN: 1556-472X
      DOI: 10.1145/3589018

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 04 May 2023
      Online AM: 25 March 2023
      Accepted: 21 March 2023
      Revised: 03 January 2023
      Received: 08 November 2021
      Published in TKDD Volume 17, Issue 7

      Author Tags

      1. Weakly-supervised semantic segmentation
      2. image-level label
      3. dual-aware domain mining
      4. cross-aware supervision

      Qualifiers

      • Research-article

      Funding Sources

      • National Natural Science Foundation of China
      • National Social Science Foundation of China
      • Research Seed Funds of School of Interdisciplinary Studies of Renmin University of China
      • Opening Project of State Key Laboratory of Digital Publishing Technology of Founder Group
