Weakly Supervised Semantic Segmentation with Patch-Based Metric Learning Enhancement

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12893)

Abstract

Weakly supervised semantic segmentation (WSSS) methods are more flexible and less costly than fully supervised ones because they require no pixel-level annotation. Existing WSSS methods that rely on image-level annotations commonly use class activation maps (CAMs) to identify seed localization cues. However, because CAMs are obtained from a classification network that focuses mainly on the most discriminative parts of an object, less discriminative parts may be missed. This study aims to improve the classification network's local visual understanding of objects by adding a metric learning task on patches sampled from each CAM-based object proposal. Because the patches contain different object parts and surrounding background, exploiting patch similarity lets the network learn entire objects rather than only their most discriminative parts. After joint training on the proposed patch-based metric learning task and the classification task, we expect the backbone network to learn more discriminative local features, so that more complete class-specific regions of an object can be identified. Extensive experiments on the PASCAL VOC 2012 dataset validate the method, and the proposed model improves on state-of-the-art methods.
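
The pipeline outlined in the abstract can be illustrated with a minimal sketch: a CAM is computed as a weighted sum of feature channels, a box is drawn around its strongly activated cells to form an object proposal, patches are sampled from that proposal, and a metric loss is applied to patch embeddings. The abstract does not state the sampling scheme or the metric-learning loss, so the threshold, the patch size, the uniform sampling, and the triplet margin loss below are assumptions, and all function names are hypothetical.

```python
import math
import random


def class_activation_map(features, weights):
    """CAM for one class: weighted sum over channels, clipped at 0, scaled to [0, 1].

    features: list of K channels, each an H x W nested list.
    weights:  list of K classifier weights for the class of interest.
    """
    height, width = len(features[0]), len(features[0][0])
    cam = [
        [max(sum(w * ch[y][x] for w, ch in zip(weights, features)), 0.0)
         for x in range(width)]
        for y in range(height)
    ]
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam


def proposal_box(cam, threshold=0.3):
    """Tight inclusive box (y0, x0, y1, x1) around cells with activation >= threshold.

    The threshold value is an assumption, not taken from the paper.
    """
    cells = [(y, x) for y, row in enumerate(cam)
             for x, v in enumerate(row) if v >= threshold]
    if not cells:
        return None
    ys = [y for y, _ in cells]
    xs = [x for _, x in cells]
    return min(ys), min(xs), max(ys), max(xs)


def sample_patches(box, patch, count, rng):
    """Top-left corners of square patches sampled uniformly inside the box (assumed scheme)."""
    y0, x0, y1, x1 = box
    max_y = max(y0, y1 - patch + 1)
    max_x = max(x0, x1 - patch + 1)
    return [(rng.randint(y0, max_y), rng.randint(x0, max_x)) for _ in range(count)]


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge on Euclidean distances between patch embeddings (assumed loss form)."""
    return max(0.0, math.dist(anchor, positive) - math.dist(anchor, negative) + margin)


# Toy 2 x 2 feature maps with two channels.
features = [[[0, 1], [2, 4]], [[1, 0], [0, 0]]]
cam = class_activation_map(features, [1.0, -1.0])
box = proposal_box(cam)
corners = sample_patches((0, 0, 3, 3), patch=2, count=5, rng=random.Random(0))
```

In a joint training setup of the kind the abstract describes, a loss such as `triplet_loss` would be added to the classification loss with some weighting. That weighting is not given here and would need to be taken from the full paper.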

Acknowledgments

This work was supported by the Natural Science Foundation of Guangdong Province, China (No. 2018A030313203).

Author information

Corresponding author

Correspondence to Patrick P. K. Chan.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Chan, P.P.K., Chen, K., Xu, L., Hu, X., Yeung, D.S. (2021). Weakly Supervised Semantic Segmentation with Patch-Based Metric Learning Enhancement. In: Farkaš, I., Masulli, P., Otte, S., Wermter, S. (eds) Artificial Neural Networks and Machine Learning – ICANN 2021. Lecture Notes in Computer Science, vol 12893. Springer, Cham. https://doi.org/10.1007/978-3-030-86365-4_38

  • DOI: https://doi.org/10.1007/978-3-030-86365-4_38

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86364-7

  • Online ISBN: 978-3-030-86365-4

  • eBook Packages: Computer Science, Computer Science (R0)
