Skip to main content

SW-VAE: Weakly Supervised Learn Disentangled Representation via Latent Factor Swapping

  • Conference paper
  • First Online:
Computer Vision – ECCV 2022 Workshops (ECCV 2022)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13802))

Included in the following conference series:

  • 2019 Accesses

Abstract

Representation disentanglement is an important goal of the representation learning that benefits various of downstream tasks. To achieve this goal, many unsupervised learning representation disentanglement approaches have been developed. However, the training process without utilizing any supervision signal have been proved to be inadequate for disentanglement representation learning. Therefore, we propose a novel weakly-supervised training approach, named as SW-VAE, which incorporates pairs of input observations as supervision signal by using the generative factors of datasets. Furthermore, we introduce strategies to gradually increase the learning difficulty during training to smooth the training process. As shown on several datasets, our model shows significant improvement over state-of-the-art (SOTA) methods on representation disentanglement tasks.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

References

  1. Bengio, Y., Courville, A., Vincent, P.: Representation learning: a review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 35(8), 1798–1828 (2013). https://doi.org/10.1109/TPAMI.2013.50

    Article  Google Scholar 

  2. Burgess, C., Kim, H.: 3d shapes dataset (2018). https://github.com/deepmind/3dshapes-dataset/

  3. Burgess, C.P., et al.: Understanding disentangling in \(\beta \)-VAE (2018)

    Google Scholar 

  4. Chen, J., Batmanghelich, K.: Weakly supervised disentanglement by pairwise similarities. In: The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, The Thirty-Second Innovative Applications of Artificial Intelligence Conference, IAAI 2020, The Tenth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2020, New York, NY, USA, 7–12 February 2020, pp. 3495–3502. AAAI Press (2020). https://aaai.org/ojs/index.php/AAAI/article/view/5754

  5. Chen, R.T.Q., Li, X., Grosse, R., Duvenaud, D.: Isolating sources of disentanglement in variational autoencoders (2019)

    Google Scholar 

  6. Chen, T., Kornblith, S., Norouzi, M., Hinton, G.: A simple framework for contrastive learning of visual representations (2020)

    Google Scholar 

  7. Creager, E., et al.: Flexibly fair representation learning by disentanglement. CoRR abs/1906.02589 (2019). arxiv.org:1906.02589

  8. Deng, J., Dong, W., Socher, R., Li, L., Li, K., Li, F.-F.: Imagenet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255 (2009). https://doi.org/10.1109/CVPR.2009.5206848

  9. Eastwood, C., Williams, C.K.I.: A framework for the quantitative evaluation of disentangled representations. In: International Conference on Learning Representations (2018). https://openreview.net/forum?id=By-7dz-AZ

  10. Feng, Z., Wang, X., Ke, C., Zeng, A.X., Tao, D., Song, M.: Dual swap disentangling. In: Advances in Neural Information Processing Systems. Curran Associates, Inc. (2018)

    Google Scholar 

  11. Gidaris, S., Singh, P., Komodakis, N.: Unsupervised representation learning by predicting image rotations (2018)

    Google Scholar 

  12. Gondal, M.W., et al.: On the transfer of inductive bias from simulation to the real world: a new disentanglement dataset (2019)

    Google Scholar 

  13. Goodfellow, I.J., et al.: Generative adversarial networks (2014)

    Google Scholar 

  14. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition (2015)

    Google Scholar 

  15. Higgins, I., et al.: Beta-VAE: learning basic visual concepts with a constrained variational framework. In: 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, 24–26 April 2017, Conference Track Proceedings (2017). https://openreview.net/forum?id=Sy2fzU9gl

  16. Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)

    Article  MathSciNet  MATH  Google Scholar 

  17. Kim, H., Mnih, A.: Disentangling by factorising. In: Dy, J., Krause, A. (eds.) Proceedings of the 35th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 80, pp. 2649–2658. PMLR, Stockholmsmässan, Stockholm Sweden, 10–15 July 2018. https://proceedings.mlr.press/v80/kim18b.html

  18. Kingma, D.P., Welling, M.: Auto-encoding variational bayes (2014)

    Google Scholar 

  19. Kingma, D.P., Welling, M.: Auto-encoding variational bayes. In: Bengio, Y., LeCun, Y. (eds.) 2nd International Conference on Learning Representations, ICLR 2014, Banff, AB, Canada, 14–16 April 2014, Conference Track Proceedings (2014). arxiv.org:1312.6114

  20. Kumar, A., Sattigeri, P., Balakrishnan, A.: Variational inference of disentangled latent concepts from unlabeled observations. In: ICLR arxiv:1711.00848 (2018)

  21. Larsen, A.B.L., Sønderby, S.K., Larochelle, H., Winther, O.: Autoencoding beyond pixels using a learned similarity metric. In: Balcan, M.F., Weinberger, K.Q. (eds.) Proceedings of The 33rd International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 48, pp. 1558–1566. PMLR, New York, New York, USA, 20–22 June 2016. https://proceedings.mlr.press/v48/larsen16.html

  22. Liu, Z., Luo, P., Wang, X., Tang, X.: Deep learning face attributes in the wild. In: Proceedings of International Conference on Computer Vision (ICCV), December 2015

    Google Scholar 

  23. Locatello, F., et al.: Challenging common assumptions in the unsupervised learning of disentangled representations (2019)

    Google Scholar 

  24. Locatello, F., Poole, B., Raetsch, G., Schölkopf, B., Bachem, O., Tschannen, M.: Weakly-supervised disentanglement without compromises. In: III, H.D., Singh, A. (eds.) Proceedings of the 37th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 119, pp. 6348–6359. PMLR, 13–18 July 2020. https://proceedings.mlr.press/v119/locatello20a.html

  25. Locatello, F., Tschannen, M., Bauer, S., Rätsch, G., Schölkopf, B., Bachem, O.: Disentangling factors of variations using few labels. In: International Conference on Learning Representations (2020). https://openreview.net/forum?id=SygagpEKwB

  26. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3431–3440 (2015). https://doi.org/10.1109/CVPR.2015.7298965

  27. Matthey, L., Higgins, I., Hassabis, D., Lerchner, A.: dSprites: Disentanglement testing sprites dataset. https://github.com/deepmind/dsprites-dataset/ (2017)

  28. Park, T., et al.: Swapping autoencoder for deep image manipulation. In: Advances in Neural Information Processing Systems (2020)

    Google Scholar 

  29. Siddharth, N., et al.: Learning disentangled representations with semi-supervised deep generative models (2017)

    Google Scholar 

  30. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition (2015)

    Google Scholar 

  31. van Steenkiste, S., Locatello, F., Schmidhuber, J., Bachem, O.: Are disentangled representations helpful for abstract visual reasoning? (2019)

    Google Scholar 

  32. Suter, R., ore Miladinović, Schölkopf, B., Bauer, S.: Robustly disentangled causal mechanisms: Validating deep representations for interventional robustness (2019)

    Google Scholar 

  33. Taigman, Y., Yang, M., Ranzato, M., Wolf, L.: Deepface: closing the gap to human-level performance in face verification. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1701–1708 (2014). https://doi.org/10.1109/CVPR.2014.220

  34. Vinyals, O., Blundell, C., Lillicrap, T., Kavukcuoglu, K., Wierstra, D.: Matching networks for one shot learning. In: Lee, D., Sugiyama, M., Luxburg, U., Guyon, I., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 29. Curran Associates, Inc. (2016). https://proceedings.neurips.cc/paper/2016/file/90e1357833654983612fb05e3ec9148c-Paper.pdf

  35. Zhou, Z.H.: A brief introduction to weakly supervised learning. Natl. Sci. Rev. 5(1), 44–53 (2017). https://doi.org/10.1093/nsr/nwx106

    Article  Google Scholar 

Download references

Acknowledgement

This material is based on research sponsored by Air Force Research Laboratory under agreement number FA8750-19-1-1000. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of Air Force Research Laboratory or the US, Government.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Jiageng Zhu .

Editor information

Editors and Affiliations

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 6841 KB)

Rights and permissions

Reprints and permissions

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Zhu, J., Xie, H., Abd-Almageed, W. (2023). SW-VAE: Weakly Supervised Learn Disentangled Representation via Latent Factor Swapping. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13802. Springer, Cham. https://doi.org/10.1007/978-3-031-25063-7_5

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-25063-7_5

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-25062-0

  • Online ISBN: 978-3-031-25063-7

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics