DOI: 10.1145/3534678.3539291

Learning Optimal Priors for Task-Invariant Representations in Variational Autoencoders

Published: 14 August 2022

Abstract

The variational autoencoder (VAE) is a powerful latent variable model for unsupervised representation learning. However, it does not work well when the number of data points is insufficient. To improve performance in such situations, the conditional VAE (CVAE) is widely used; it aims to share task-invariant knowledge across multiple tasks through a task-invariant latent variable. In the CVAE, the posterior of the latent variable given the data point and task is regularized by a task-invariant prior, which is modeled by the standard Gaussian distribution. Although this regularization encourages independence between the latent variable and the task, the latent variable remains dependent on the task. To reduce this task-dependency, previous work introduced an additional regularizer; however, the learned representation does not work well on the target tasks. In this study, we theoretically investigate why the CVAE cannot sufficiently reduce the task-dependency and show that the simple standard Gaussian prior is one of the causes. On this basis, we propose a theoretically optimal prior for reducing the task-dependency. In addition, we show theoretically that, unlike in previous work, our learned representation works well on the target tasks. Experiments on various datasets show that our approach obtains better task-invariant representations, which improves the performance of downstream applications such as density estimation and classification.
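
As background for why a fixed standard Gaussian prior only indirectly encourages task-independence, recall the standard decomposition of the averaged KL term in the (C)VAE objective (an ELBO-surgery-style identity in the spirit of Hoffman and Johnson, stated here as context rather than as this paper's derivation). Writing q(z) for the aggregated posterior, i.e., the average of q(z | x, t) over the data and task distribution:

```latex
% ELBO-surgery-style identity (background, not the paper's derivation):
% the averaged posterior-prior KL splits into a mutual-information term
% and a marginal KL between the aggregated posterior q(z) and the prior.
\mathbb{E}_{p(x,t)}\left[ \mathrm{KL}\big( q(z \mid x, t) \,\|\, p(z) \big) \right]
  = I_q\big(z; (x, t)\big) + \mathrm{KL}\big( q(z) \,\|\, p(z) \big)
```

The second term only matches the aggregate q(z) to the prior, and the first term penalizes all information in z, including what reconstruction needs; neither directly forces q(z | t) to equal q(z), so the latent variable can remain task-dependent.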

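For concreteness, the following is a minimal sketch of the CVAE setup described in the abstract, with the posterior q(z | x, t) regularized toward the standard Gaussian prior N(0, I). Architectures, dimensions, and names here are illustrative assumptions, not the authors' implementation or the proposed optimal prior.

```python
# Minimal CVAE sketch (illustrative; not the paper's implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class CVAE(nn.Module):
    def __init__(self, x_dim, t_dim, z_dim, h_dim=256):
        super().__init__()
        # Encoder q(z | x, t): conditions on the data point and the task.
        self.enc = nn.Sequential(nn.Linear(x_dim + t_dim, h_dim), nn.ReLU())
        self.enc_mu = nn.Linear(h_dim, z_dim)
        self.enc_logvar = nn.Linear(h_dim, z_dim)
        # Decoder p(x | z, t).
        self.dec = nn.Sequential(
            nn.Linear(z_dim + t_dim, h_dim), nn.ReLU(), nn.Linear(h_dim, x_dim)
        )

    def forward(self, x, t):
        h = self.enc(torch.cat([x, t], dim=-1))
        mu, logvar = self.enc_mu(h), self.enc_logvar(h)
        # Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I).
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        x_logits = self.dec(torch.cat([z, t], dim=-1))
        return x_logits, mu, logvar

def neg_elbo(x_logits, x, mu, logvar):
    # Reconstruction term (Bernoulli likelihood for binary-valued x).
    rec = F.binary_cross_entropy_with_logits(x_logits, x, reduction="sum")
    # KL(q(z | x, t) || N(0, I)): regularization toward the task-invariant
    # standard Gaussian prior, which the paper argues does not sufficiently
    # remove task-dependency from the latent variable.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kl

# Hypothetical usage: x in [0, 1]^784 (e.g., MNIST), t a one-hot task vector.
# model = CVAE(x_dim=784, t_dim=10, z_dim=16)
# x_logits, mu, logvar = model(x, t)
# loss = neg_elbo(x_logits, x, mu, logvar)
```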

Cited By

  • (2024) "Analysis of Continual Learning Techniques for Image Generative Models with Learned Class Information Management." Sensors 24(10):3087. DOI: 10.3390/s24103087. Online publication date: 13 May 2024.
  • (2023) "Additive autoencoder for dimension estimation." Neurocomputing 551:C. DOI: 10.1016/j.neucom.2023.126520. Online publication date: 28 September 2023.

Published In

KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
August 2022
5033 pages
ISBN:9781450393850
DOI:10.1145/3534678
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 14 August 2022

Author Tags

  1. multi-task learning
  2. variational autoencoder

Qualifiers

  • Research-article

Conference

KDD '22

Acceptance Rates

Overall acceptance rate: 1,133 of 8,635 submissions (13%)
