DOI: 10.1145/3534678.3539291

Learning Optimal Priors for Task-Invariant Representations in Variational Autoencoders

Published: 14 August 2022

Abstract

The variational autoencoder (VAE) is a powerful latent variable model for unsupervised representation learning. However, it does not work well when the number of data points is insufficient. To improve performance in such situations, the conditional VAE (CVAE) is widely used; it aims to share task-invariant knowledge across multiple tasks through a task-invariant latent variable. In the CVAE, the posterior of the latent variable given the data point and task is regularized by a task-invariant prior, which is modeled by the standard Gaussian distribution. Although this regularization encourages independence between the latent variable and the task, the latent variable remains dependent on the task. To reduce this task-dependency, previous work introduced an additional regularizer; however, the learned representation does not work well on the target tasks. In this study, we theoretically investigate why the CVAE cannot sufficiently reduce the task-dependency and show that the simple standard Gaussian prior is one of the causes. On this basis, we propose a theoretically optimal prior for reducing the task-dependency. In addition, we show theoretically that, unlike in previous work, our learned representation works well on the target tasks. Experiments on various datasets show that our approach obtains better task-invariant representations, which improves the performance of downstream applications such as density estimation and classification.
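
As background for why a fixed standard Gaussian prior only indirectly encourages task-independence, recall the standard decomposition of the averaged KL term in the (C)VAE objective (an ELBO-surgery-style identity in the spirit of Hoffman and Johnson, stated here as context rather than as this paper's derivation). Writing q(z) for the aggregated posterior, i.e., the average of q(z | x, t) over the data and task distribution:

```latex
% ELBO-surgery-style identity (background, not the paper's derivation):
% the averaged posterior-prior KL splits into a mutual-information term
% and a marginal KL between the aggregated posterior q(z) and the prior.
\mathbb{E}_{p(x,t)}\left[ \mathrm{KL}\big( q(z \mid x, t) \,\|\, p(z) \big) \right]
  = I_q\big(z; (x, t)\big) + \mathrm{KL}\big( q(z) \,\|\, p(z) \big)
```

The second term only matches the aggregate q(z) to the prior, and the first term penalizes all information in z, including what reconstruction needs; neither directly forces q(z | t) to equal q(z), so the latent variable can remain task-dependent.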

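For concreteness, the following is a minimal sketch of the CVAE setup described in the abstract, with the posterior q(z | x, t) regularized toward the standard Gaussian prior N(0, I). Architectures, dimensions, and names here are illustrative assumptions, not the authors' implementation or the proposed optimal prior.

```python
# Minimal CVAE sketch (illustrative; not the paper's implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class CVAE(nn.Module):
    def __init__(self, x_dim, t_dim, z_dim, h_dim=256):
        super().__init__()
        # Encoder q(z | x, t): conditions on the data point and the task.
        self.enc = nn.Sequential(nn.Linear(x_dim + t_dim, h_dim), nn.ReLU())
        self.enc_mu = nn.Linear(h_dim, z_dim)
        self.enc_logvar = nn.Linear(h_dim, z_dim)
        # Decoder p(x | z, t).
        self.dec = nn.Sequential(
            nn.Linear(z_dim + t_dim, h_dim), nn.ReLU(), nn.Linear(h_dim, x_dim)
        )

    def forward(self, x, t):
        h = self.enc(torch.cat([x, t], dim=-1))
        mu, logvar = self.enc_mu(h), self.enc_logvar(h)
        # Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I).
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        x_logits = self.dec(torch.cat([z, t], dim=-1))
        return x_logits, mu, logvar

def neg_elbo(x_logits, x, mu, logvar):
    # Reconstruction term (Bernoulli likelihood for binary-valued x).
    rec = F.binary_cross_entropy_with_logits(x_logits, x, reduction="sum")
    # KL(q(z | x, t) || N(0, I)): regularization toward the task-invariant
    # standard Gaussian prior, which the paper argues does not sufficiently
    # remove task-dependency from the latent variable.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kl

# Hypothetical usage: x in [0, 1]^784 (e.g., MNIST), t a one-hot task vector.
# model = CVAE(x_dim=784, t_dim=10, z_dim=16)
# x_logits, mu, logvar = model(x, t)
# loss = neg_elbo(x_logits, x, mu, logvar)
```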

Cited By

  • (2024) "Analysis of Continual Learning Techniques for Image Generative Models with Learned Class Information Management." Sensors 24(10):3087. DOI: 10.3390/s24103087. Online publication date: 13 May 2024.
  • (2023) "Additive autoencoder for dimension estimation." Neurocomputing 551:C. DOI: 10.1016/j.neucom.2023.126520. Online publication date: 28 September 2023.

Published In

KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
August 2022
5033 pages
ISBN:9781450393850
DOI:10.1145/3534678
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 14 August 2022

Author Tags

  1. multi-task learning
  2. variational autoencoder

Qualifiers

  • Research-article

Conference

KDD '22

Acceptance Rates

Overall acceptance rate: 1,133 of 8,635 submissions (13%)
