Weakly Supervised Semantic Segmentation with Patch-Based Metric Learning Enhancement

Chan, Patrick P. K.; Chen, Keke; Xu, Linyi; Hu, Xiaoman; Yeung, Daniel S.

doi:10.1007/978-3-030-86365-4_38

Conference paper
First Online: 07 September 2021

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12893)

Abstract

Weakly supervised semantic segmentation (WSSS) methods are more flexible and less costly than fully supervised ones because they require no pixel-level annotation. Existing WSSS methods that use image-level annotations commonly rely on class activation maps (CAMs) to identify seed localization cues. However, CAMs are obtained from a classification network that focuses mainly on the most discriminative parts of an object, so less discriminative parts may be missed. This study aims to improve the classification network's local visual understanding of objects by adding a metric learning task on patches sampled from each CAM-based object proposal. Because the patches contain different object parts and surrounding background, leveraging patch similarity lets the network learn entire objects rather than only their most discriminative parts. After joint training on the proposed patch-based metric learning task and the classification task, the backbone network is expected to learn more discriminative local features, so that more complete class-specific regions of an object can be identified. Extensive experiments on the PASCAL VOC 2012 dataset show that the proposed model improves on state-of-the-art methods.
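The two ingredients named in the abstract, a class activation map and a metric learning loss on patches trained jointly with classification, can be illustrated with a minimal sketch. It rests on several assumptions: the abstract does not say which metric loss the paper uses, so a triplet margin loss is swapped in here as one common choice; the function names, the `metric_weight` value, and the toy inputs are all hypothetical; and patch sampling from object proposals is omitted.

```python
import math


def class_activation_map(features, class_weights):
    """CAM for one class: weighted sum of feature channels at each location.

    features: list of K channels, each an H x W nested list of floats.
    class_weights: list of K classifier weights for the target class.
    """
    height, width = len(features[0]), len(features[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for channel, weight in zip(features, class_weights):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weight * channel[y][x]
    return cam


def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Assumed metric loss: pull anchor toward positive, push it from negative."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)


def joint_loss(classification_loss, metric_loss, metric_weight=0.5):
    """Assumed joint objective: classification loss plus weighted metric loss."""
    return classification_loss + metric_weight * metric_loss


# Toy example: two 2x2 feature channels and their weights for one class.
features = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 2.0], [0.0, 0.0]],
]
cam = class_activation_map(features, [0.5, 1.5])

# Hypothetical patch embeddings: two patches of one class, one of another.
metric = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
total = joint_loss(0.7, metric)
```

In this toy case the negative patch already lies farther from the anchor than the positive by more than the margin, so the metric term is zero and only the classification loss contributes to `total`.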

Acknowledgments

This paper is supported by the Natural Science Foundation of Guangdong Province, China (No. 2018A030313203).

Author information

Authors and Affiliations

School of Computer Science and Engineering, South China University of Technology, Guangzhou, China
Patrick P. K. Chan, Keke Chen, Linyi Xu & Xiaoman Hu
Hong Kong, China
Daniel S. Yeung

Corresponding author

Correspondence to Patrick P. K. Chan.

Editor information

Editors and Affiliations

Comenius University in Bratislava, Bratislava, Slovakia
Igor Farkaš
iMotions A/S, Copenhagen, Denmark
Paolo Masulli
University of Tübingen, Tübingen, Baden-Württemberg, Germany
Sebastian Otte
Universität Hamburg, Hamburg, Germany
Stefan Wermter

Cite this paper

Chan, P.P.K., Chen, K., Xu, L., Hu, X., Yeung, D.S. (2021). Weakly Supervised Semantic Segmentation with Patch-Based Metric Learning Enhancement. In: Farkaš, I., Masulli, P., Otte, S., Wermter, S. (eds) Artificial Neural Networks and Machine Learning – ICANN 2021. Lecture Notes in Computer Science, vol 12893. Springer, Cham. https://doi.org/10.1007/978-3-030-86365-4_38

DOI: https://doi.org/10.1007/978-3-030-86365-4_38
Published: 07 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86364-7
Online ISBN: 978-3-030-86365-4
eBook Packages: Computer Science, Computer Science (R0)
