SeqVAE: Sequence variational autoencoder with policy gradient

  • Original Submission
Applied Intelligence

Abstract

In this paper, we propose SeqVAE, a variant of the Variational Autoencoder (VAE) for sequence generation that combines a recurrent VAE with policy gradient from reinforcement learning. The goal of SeqVAE is to reduce the deviation of the VAE optimization objective, which we achieve by adding a policy-gradient loss to the model. We give two ways to calculate the policy-gradient loss: one taken from SeqGAN and one that we propose. In experiments with both, our proposed method is better than all baselines, and the results show that SeqVAE can alleviate the “post-collapse” (posterior collapse) problem. Essentially, SeqVAE can be regarded as a combination of VAE and Generative Adversarial Net (GAN), and it has better learning ability than the plain VAE because of the added adversarial process. Finally, an application of SeqVAE to music melody generation is available online (footnotes 1 and 2).
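
To make the idea of adding a policy-gradient loss to a VAE objective concrete, the following is a minimal sketch and not the authors' implementation. It assumes a diagonal Gaussian posterior, a standard normal prior, a REINFORCE-style surrogate loss, and per-step rewards that are simply passed in as numbers (for example, scores from a discriminator). The function names, the `pg_weight` parameter, and the way the terms are summed are hypothetical choices for illustration; neither of the two reward calculations described in the paper is reproduced here.

```python
import numpy as np

def kl_to_standard_normal(mu, logvar):
    """KL(N(mu, diag(exp(logvar))) || N(0, I)), summed over latent dimensions."""
    return 0.5 * np.sum(np.exp(logvar) + mu ** 2 - 1.0 - logvar, axis=-1)

def reconstruction_nll(log_probs, targets):
    """Negative log-likelihood of the target tokens.

    log_probs has shape (T, V): per-step log-probabilities over a vocabulary.
    """
    steps = np.arange(len(targets))
    return -np.sum(log_probs[steps, targets])

def policy_gradient_loss(log_probs, sampled, rewards):
    """REINFORCE-style surrogate: -sum_t reward_t * log pi(sampled_t).

    Assumption: rewards are given per step; how they are computed is
    outside the scope of this sketch.
    """
    steps = np.arange(len(sampled))
    return -np.sum(rewards * log_probs[steps, sampled])

def seqvae_style_loss(mu, logvar, log_probs, targets, sampled, rewards,
                      pg_weight=1.0):
    """Plain VAE loss (reconstruction + KL) plus a weighted policy-gradient term.

    pg_weight is a hypothetical hyperparameter, not taken from the paper.
    """
    vae_loss = reconstruction_nll(log_probs, targets) + kl_to_standard_normal(mu, logvar)
    return vae_loss + pg_weight * policy_gradient_loss(log_probs, sampled, rewards)

if __name__ == "__main__":
    # Toy inputs: 3 time steps, vocabulary of 4 tokens, 2 latent dimensions.
    mu, logvar = np.zeros(2), np.zeros(2)
    log_probs = np.log(np.full((3, 4), 0.25))
    targets = np.array([0, 1, 2])
    sampled = np.array([3, 3, 0])
    rewards = np.array([0.5, 0.5, 0.5])
    print(seqvae_style_loss(mu, logvar, log_probs, targets, sampled, rewards))
```

In this sketch the reward-weighted term pushes up the log-probability of sampled tokens that receive high reward, while the reconstruction and KL terms remain those of a plain VAE. Setting `pg_weight` to zero recovers the plain VAE loss.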

Notes

  1. https://github.com/INotWant/seqvae

  2. https://github.com/jukedeck/nottingham-dataset

  3. https://github.com/tensorflow/magenta

References

  1. Arjovsky M, Chintala S, Bottou L (2017) Wasserstein gan. arXiv:1701.07875

  2. Bachman P, Precup D Data generation as sequential decision making. In: Advances in Neural Information Processing Systems, pp. 3249–3257

  3. Bao J, Chen D, Wen F, Li H, Hua G Cvae-gan: fine-grained image generation through asymmetric training. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2745–2754

  4. Bengio S, Vinyals O, Jaitly N, Shazeer N Scheduled sampling for sequence prediction with recurrent neural networks. In: Advances in Neural Information Processing Systems, pp. 1171–1179

  5. Bowman SR, Vilnis L, Vinyals O, Dai AM, Jozefowicz R, Bengio S (2015) Generating sentences from a continuous space. arXiv:1511.06349

  6. Carter S, Nielsen M (2017) Using artificial intelligence to augment human intelligence. Distill 2(12):e9

  7. Dong HW, Hsiao WY, Yang LC, Yang YH Musegan: Multi-track sequential generative adversarial networks for symbolic music generation and accompaniment. In: Thirty-Second AAAI Conference on Artificial Intelligence

  8. Engel J, Resnick C, Roberts A, Dieleman S, Norouzi M, Eck D, Simonyan K Neural audio synthesis of musical notes with wavenet autoencoders. In: Proceedings of the 34th International Conference on Machine Learning, vol 70, pp 1068–1077. JMLR.org

  9. Goodfellow I (2016) Generative adversarial networks for text http://goo.gl/wg9DR7

  10. Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y Generative adversarial nets. In: Advances in neural information processing systems, pp. 2672–2680

  11. Ha D, Eck D (2017) A neural representation of sketch drawings. arXiv:1704.03477

  12. He K, Zhang X, Ren S, Sun J Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778

  13. Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780

  14. Huszár F (2015) How (not) to train your generative model: Scheduled sampling, likelihood, adversary. arXiv:1511.05101

  15. Karras T, Laine S, Aila T A style-based generator architecture for generative adversarial networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 4401–4410

  16. Kim Y (2014) Convolutional neural networks for sentence classification. arXiv:1408.5882

  17. Kingma DP, Ba J (2015) Adam: A method for stochastic optimization. arXiv:1412.6980

  18. Konda VR, Tsitsiklis JN Actor-critic algorithms. In: Advances in neural information processing systems, pp 1008–1014

  19. Mnih V, Kavukcuoglu K, Silver D, Graves A, Antonoglou I, Wierstra D, Riedmiller M (2013) Playing atari with deep reinforcement learning. arXiv:1312.5602

  20. Mnih V, Kavukcuoglu K, Silver D, Rusu AA, Veness J, Bellemare MG, Graves A, Riedmiller M, Fidjeland AK, Ostrovski G (2015) Human-level control through deep reinforcement learning. Nature 518(7540):529–533

  21. Oord A.v.d, Dieleman S, Zen H, Simonyan K, Vinyals O, Graves A, Kalchbrenner N, Senior A, Kavukcuoglu K (2016) Wavenet: A generative model for raw audio. arXiv:1609.03499

  22. Papineni K, Roukos S, Ward T, Zhu WJ Bleu: a method for automatic evaluation of machine translation. In: Proceedings of the 40th annual meeting on association for computational linguistics, pp 311–318. Association for Computational Linguistics

  23. Roberts A, Engel J, Raffel C, Hawthorne C, Eck D (2018) A hierarchical latent vector model for learning long-term structure in music. arXiv:1803.05428

  24. Semeniuta S, Severyn A, Barth E (2017) A hybrid convolutional variational autoencoder for text generation. arXiv:1702.02390

  25. Sutton RS, McAllester DA, Singh SP, Mansour Y Policy gradient methods for reinforcement learning with function approximation. In: Advances in neural information processing systems, pp. 1057–1063

  26. Veselý K, Ghoshal A, Burget L, Povey D Sequence-discriminative training of deep neural networks. In: Interspeech, vol 2013, pp 2345–2349

  27. Wang H, Qin Z, Wan T Text generation based on generative adversarial nets with latent variables. In: Pacific-Asia Conference on Knowledge Discovery and Data Mining. Springer, pp 92–103

  28. Yu L, Zhang W, Wang J, Yu Y Seqgan: Sequence generative adversarial nets with policy gradient. In: Thirty-First AAAI Conference on Artificial Intelligence

  29. Zhou F, Yang S, Fujita H, Chen D, Wen C (2020) Deep learning fault diagnosis method based on global optimization gan for unbalanced data. Knowl-Based Syst 187:104837

  30. Zhu JY, Park T, Isola P, Efros AA Unpaired image-to-image translation using cycle-consistent adversarial networks. In: Proceedings of the IEEE international conference on computer vision, pp. 2223–2232

Acknowledgements

Finally, I must express my very profound gratitude to my parents and to my thesis advisor for providing me with unfailing support and continuous encouragement throughout my years of study and through the process of researching and writing this thesis. This accomplishment would not have been possible without them. Thank you.

Author information

Corresponding author

Correspondence to Yidong Cui.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

https://github.com/INotWant/MelodyGeneration_SeqVAE1

https://github.com/INotWant/MelodyGeneration_SeqVAE2

About this article

Cite this article

Gao, T., Cui, Y. & Ding, F. SeqVAE: Sequence variational autoencoder with policy gradient. Appl Intell 51, 9030–9037 (2021). https://doi.org/10.1007/s10489-021-02374-7

  • DOI: https://doi.org/10.1007/s10489-021-02374-7
