(SP)$^2$Net for Generalized Zero-Label Semantic Segmentation

Das, Anurag; Xian, Yongqin; He, Yang; Schiele, Bernt; Akata, Zeynep

doi:10.1007/978-3-030-92659-5_15

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 13024)

Included in the following conference series:

DAGM German Conference on Pattern Recognition

Abstract

Generalized zero-label semantic segmentation aims to make pixel-level predictions for both seen and unseen classes in an image. Prior works approach this task by leveraging semantic word embeddings to learn a semantic projection layer or to generate features of unseen classes. However, those methods rely on standard segmentation networks that may not generalize well to unseen classes. To address this issue, we propose to leverage a class-agnostic segmentation prior provided by superpixels and introduce a superpixel pooling (SP-pooling) module as an intermediate layer of a segmentation network. Also, while prior works ignore the pixels of unseen classes that appear in training images, we propose to minimize the log probability of seen classes, alleviating biased predictions in those ignore regions. We show that our (SP)$^2$Net significantly outperforms the state of the art on different data splits of the PASCAL VOC 2012 and PASCAL-Context benchmarks.
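
To make the idea of superpixel pooling concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that SP-pooling averages the features of all pixels belonging to the same superpixel and writes that average back to every pixel of the superpixel; the abstract does not specify the pooling operator, so average pooling is an assumption here. The function name `superpixel_pool` and the array layout (`H x W x C` features, `H x W` integer superpixel ids) are hypothetical choices for illustration.

```python
import numpy as np


def superpixel_pool(features, superpixels):
    """Average-pool features inside each superpixel (illustrative sketch).

    features:    float array of shape (H, W, C), per-pixel features.
    superpixels: int array of shape (H, W), one superpixel id per pixel.
    Returns an (H, W, C) array in which every pixel carries the mean
    feature of the superpixel it belongs to.
    """
    features = np.asarray(features, dtype=np.float64)
    h, w, c = features.shape
    flat_feat = features.reshape(-1, c)

    # Remap arbitrary superpixel ids to contiguous indices 0..K-1.
    _, index = np.unique(np.asarray(superpixels).reshape(-1), return_inverse=True)
    k = int(index.max()) + 1

    # Accumulate per-superpixel feature sums and pixel counts.
    sums = np.zeros((k, c), dtype=np.float64)
    np.add.at(sums, index, flat_feat)
    counts = np.bincount(index, minlength=k).astype(np.float64)

    # Mean feature per superpixel, broadcast back to the pixel grid.
    means = sums / counts[:, None]
    return means[index].reshape(h, w, c)


if __name__ == "__main__":
    # Toy input: a 4x4 feature map with 3 channels and two superpixels
    # (left half has id 10, right half has id 42).
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(4, 4, 3))
    sp = np.array([[10, 10, 42, 42]] * 4)
    pooled = superpixel_pool(feats, sp)
    print(pooled.shape)
```

Because superpixels are computed from low-level image cues rather than class labels, pooling of this kind is class-agnostic: pixels grouped into one superpixel share a feature regardless of whether their class was seen during training.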

Y. Xian and Y. He—The majority of the work was done when Yongqin Xian and Yang He were with MPI for Informatics.

Notes

1. (SP)$^2$Net refers to (SP)$^2$Net with COB superpixels, unless otherwise stated.
2. COB-based superpixels are pretrained on object boundaries on the PASCAL dataset.

Acknowledgements

This work has been partially funded by the ERC (853489 - DEXIM) and by the DFG (2064/1 - Project number 390727645).

Author information

Authors and Affiliations

MPI for Informatics, Saarland Informatics Campus, Saarbrücken, Germany
Anurag Das & Bernt Schiele
MPI for Intelligent Systems, Tübingen, Germany
Zeynep Akata
University of Tübingen, Tübingen, Germany
Zeynep Akata
ETH Zurich, Zürich, Switzerland
Yongqin Xian
Amazon, Bellevue, USA
Yang He

Corresponding author

Correspondence to Anurag Das.

Editor information

Editors and Affiliations

Fraunhofer IAIS, Sankt Augustin, Germany
Christian Bauckhage
University of Bonn, Bonn, Germany
Juergen Gall
University of Illinois at Urbana-Champaign, Urbana, IL, USA
Alexander Schwing

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 13488 KB)

About this paper

Cite this paper

Das, A., Xian, Y., He, Y., Schiele, B., Akata, Z. (2021). (SP)$^2$Net for Generalized Zero-Label Semantic Segmentation. In: Bauckhage, C., Gall, J., Schwing, A. (eds) Pattern Recognition. DAGM GCPR 2021. Lecture Notes in Computer Science, vol 13024. Springer, Cham. https://doi.org/10.1007/978-3-030-92659-5_15

DOI: https://doi.org/10.1007/978-3-030-92659-5_15
Published: 13 January 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-92658-8
Online ISBN: 978-3-030-92659-5
eBook Packages: Computer Science, Computer Science (R0)
