SW-VAE: Weakly Supervised Learn Disentangled Representation via Latent Factor Swapping

Zhu, Jiageng; Xie, Hanchen; Abd-Almageed, Wael

doi:10.1007/978-3-031-25063-7_5

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13802)

Included in the following conference series:

European Conference on Computer Vision

Abstract

Representation disentanglement is an important goal of representation learning that benefits a variety of downstream tasks. To achieve this goal, many unsupervised approaches to learning disentangled representations have been developed. However, training without any supervision signal has been shown to be inadequate for learning disentangled representations. Therefore, we propose a novel weakly supervised training approach, named SW-VAE, which uses pairs of input observations as a supervision signal by exploiting the generative factors of the datasets. Furthermore, we introduce strategies that gradually increase the learning difficulty during training to smooth the training process. On several datasets, our model shows significant improvement over state-of-the-art (SOTA) methods on representation disentanglement tasks.
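The abstract does not spell out the swapping mechanism or the difficulty schedule, so the following is only a rough sketch of the general idea suggested by the title, not the authors' implementation. It assumes that the latent codes of a pair are plain vectors, that the indices of the latent dimensions to exchange are already known, and that "difficulty" means the number of generative factors in which a pair differs. The names `swap_latents` and `num_differing_factors`, and the linear schedule, are hypothetical.

```python
def swap_latents(z_a, z_b, dims_to_swap):
    """Exchange the entries at dims_to_swap between two latent codes.

    Assumption: each latent dimension is meant to capture one generative
    factor, so swapping dimensions tied to factors the pair shares should
    leave both reconstructions unchanged.
    """
    z_a_new, z_b_new = list(z_a), list(z_b)
    for d in dims_to_swap:
        z_a_new[d], z_b_new[d] = z_b[d], z_a[d]
    return z_a_new, z_b_new


def num_differing_factors(step, total_steps, max_factors):
    """Hypothetical linear curriculum for pair difficulty.

    Early in training, pairs differ in a single factor; the count grows
    linearly with training progress up to max_factors.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    return 1 + int(progress * (max_factors - 1))


# Example: swap dimensions 0 and 2 of two three-dimensional codes.
z1, z2 = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
z1_swapped, z2_swapped = swap_latents(z1, z2, [0, 2])
```

In a training loop of this kind, the swapped codes would be decoded and compared against the original inputs, while the schedule controls how hard the sampled pairs are at each step. Both choices are illustrative assumptions.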

Acknowledgement

This material is based on research sponsored by the Air Force Research Laboratory under agreement number FA8750-19-1-1000. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the Air Force Research Laboratory or the U.S. Government.

Author information

Authors and Affiliations

USC Ming Hsieh Department of Electrical and Computer Engineering, Los Angeles, USA
Jiageng Zhu & Wael Abd-Almageed
USC Information Sciences Institute, Marina del Rey, USA
Jiageng Zhu, Hanchen Xie & Wael Abd-Almageed
Visual Intelligence and Multimedia Analytics Laboratory, Los Angeles, USA
Jiageng Zhu, Hanchen Xie & Wael Abd-Almageed

Corresponding author

Correspondence to Jiageng Zhu.

Editor information

Editors and Affiliations

IBM Research - MIT-IBM Watson AI Lab, Massachusetts, USA
Leonid Karlinsky
Technion – Israel Institute of Technology, Haifa, Israel
Tomer Michaeli
Kyoto University, Kyoto, Japan
Ko Nishino

About this paper

Cite this paper

Zhu, J., Xie, H., Abd-Almageed, W. (2023). SW-VAE: Weakly Supervised Learn Disentangled Representation via Latent Factor Swapping. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13802. Springer, Cham. https://doi.org/10.1007/978-3-031-25063-7_5

DOI: https://doi.org/10.1007/978-3-031-25063-7_5
Published: 16 February 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25062-0
Online ISBN: 978-3-031-25063-7
eBook Packages: Computer Science, Computer Science (R0)
